Abstract

Recent theoretical developments in cognitive psychology imply both a need and a possibility for 
methodological development. In particular, the theory of problem solving proposed by Allen Newell and 
Herbert A. Simon provides the rationale for a new empirical method that here will be called trace analysis. 
A detailed example is presented in which trace analysis is applied to human performance on a spatial
reasoning task. The relations between trace analysis, on the one hand, and the psychometric ideas of 
measurement and standardization, on the other, are discussed. A non-psychometric approach to 
standardized testing, called theory referenced test construction, is proposed. The main idea of theory 
referenced test construction is that test items should be validated against computer-implemented
information processing models of the relevant cognitive functions. 

1. On Methodology

Mental life is invisible and its expression in action is under voluntary, intentional control. The
psychological sciences have been slow in accepting the methodological challenge posed by these two 
facts. Several evasive tactics have been tried. The first tactic was to observe mental life directly, by 
looking inward. The second evasive tactic was to decree that action itself is the object of study in 
psychology. Both of these tactics deny the necessity of inferring mental events from observations of
actions. In a third evasive move psychology was declared a part of the humanities, with the implication 
that interpretation of human behavior is necessarily, irrevocably subjective. While admitting the need for 
inferences this stance denies the possibility of imposing a discipline on those inferences, a discipline 
which makes rational discussion and intersubjective agreement possible. We now know that the evasive 
tactics of introspectionism, behaviorism, and humanistic psychology do not work; they were worth trying, 
but they failed. We are left with the sole option of tackling the methodological challenge of mental life 
head on. 

One might take the view that a scientist should attack significant substantive problems, propose 
interesting theories, and discover novel facts. If he 1 does, the methodological development of his science 
will take care of itself. Methodology per se is boring, unending fiddling with technicalities, an activity best
left to the pedantic introvert who brings no creativity to his work. A real scientist worries about ideas and 
problems, not about methods. 

There are several mistakes hiding in this proud attitude. First, careful observation of scientific 
research by a knowledgeable and sympathetic observer like Toulmin (1972) has revealed that the 
knowledge transmitted by one generation of scientists to the next does not consist mainly of particular 
explanations, but, instead, of the procedures by which explanations are constructed. There is, then,
evidence that our methods are closer to the center of scientific knowledge than the traditional disdain for 
methodological work admits. Second, methodology has to be distinguished from the perfecting of 
measuring instruments. Methodology certainly deals with the accuracy of observations in general and the 
precision of measurements in particular. But the core topics of methodology are: the nature of evidence, 
forms of description, patterns of inference, boundary conditions on the validity of inferences, the design of 
explanatory procedures, and the standards by which particular explanations are judged. Third, scientific
breakthroughs are brought about by new methodologies as well as by new ideas. One need only mention
the electron microscope, the carbon-14 dating method, and the cyclotron. Fourth, in the application of
science to practical concerns methods are often useful even in the absence of theory. For instance, the
method of ascertaining verticality by suspending a weight from a string is useful for building a house even
in the absence of a theory of gravitation. Methodology is essential both for the creation and the
application of scientific theories.

1 For convenience I am using "he", "his", etc. to refer to both genders throughout the chapter.

Methodological innovation has not been a conspicuous feature of psychological research. The
evasive tactics mentioned above discouraged serious thinking about how to infer states of mind from
observations of behavior. Methodological development was restricted to the design of new statistical 
procedures, and methodological knowledge became limited to knowledge about the proper application of 
such procedures. But the cognitive revolution has put methodology back on the
psychologist's agenda. Cognitive psychologists are collecting new types of data in support of new types
of theories. We need a new view of methodology, new concepts to replace the stale dichotomies that 
dominated methodological debate in the past (description vs. hypothesis testing, experimental vs. 
correlational, laboratory control vs. ecological validity, objective vs. subjective, research vs. application, 
standardized vs. clinical, etc.). 

In order to take a fresh look at the fundamental dimensions of psychological methods, consider the 
following formulation of the basic problem: Given a behavioral record of person P (at time t), infer a
description of P's mental state (at t). This formulation implies that the three fundamental dimensions of 
psychological methods are (a) the type of behavioral record to which a method applies (i. e., the input),
(b) the type of description of mental states that a method generates (i. e., the output), and (c) the rules of
inference (or, in the terminology of Toulmin (1972), the explanatory procedures) that are used to construct
the description, given the behavioral record (i. e., the transformation of the input into the output).

With respect to input, we can distinguish between extensive and intensive methods. Extensive 
methods rely on relatively shallow analysis of a large number of performances, while intensive methods
rely on a deep analysis of a small number of performances (possibly just one; see Dukes, 1968). For
instance, the methods used by experimental psychology and by psychometrics are extensive, while the
methods used by psychoanalysts are intensive. Furthermore, behavioral records vary with respect to
whether they preserve sequential information or not, and methods that do preserve sequential information
vary with respect to the temporal density of that information.

With respect to output, we can distinguish between singleton and aggregate descriptions. Singleton
descriptions summarize observations that derive from a single individual, while aggregate descriptions
summarize observations which derive from a group of individuals. For instance, psychometric and
psychoanalytic methods produce singleton descriptions, while the methods used in experimental
psychology typically produce aggregate descriptions. 

With respect to explanatory procedures, we can distinguish between open and closed methods. The 
purpose of open methods is to reveal the structure in the behavioral record; open methods proceed in
bottom-up fashion from the data towards the description. The purpose of closed methods is to ascertain
how closely the behavioral record fits a pre-defined structure. For instance, the methods used in
psychoanalysis are typically open methods, while the methods used in experimental psychology are
closed methods. The psychometric tradition has a double-sided relation to this dimension: The
construction of tests uses open methods like factor analysis and cluster analysis, but the application of a
test battery, once constructed, is an instance of a closed method.

In summary, I suggest that psychological methods should be discussed in terms of the type of 
behavioral records they apply to, what type of descriptions of mental states they generate, and what type 
of explanatory procedures they use to transform the record into the description. The rest of this chapter 
presupposes this schema for the analysis of methods. 

A major new type of behavioral record introduced into cognitive psychology in recent years 2 is that of 
protocols, in particular think-aloud protocols (Newell, 1966; Newell & Simon, 1972; Ericsson & Simon,
1984; Williams & Hollan, 1981; Williams & Santos-Williams, 1980). A protocol is a verbatim transcript of
spontaneous talk on the part of a subject about a task. There are two frequently used methods for the 
processing of protocols. The simplest is the method of excerpts which has been practiced in the 
humanities for a long time. It consists in selecting a part of the corpus and printing it in full, thus letting 
the reader see for himself, as it were. The excerpt is selected so as to exhibit a typical case, to prove the 
existence of some phenomenon, or to make a point of some kind; frequently, two excerpts are shown side 
by side in order to illustrate a difference or a contrast. 

protocols was not invented in recent years, but rather re-discovered. See the historical section in Ericsson and

The other popular method for processing verbal protocols is known in social psychology as content
analysis (Holsti, 1968). In content analysis one proceeds by defining a set of categories of textual
events and counting the frequency with which each category occurs in a corpus of protocols. These
frequencies can be used as dependent variables in experimental studies. Cognitive psychologists
re-invented this method and have used it frequently in recent years, without, however, paying attention to
the rather extensive experience of social psychologists with respect to its applicability, reliability, and
validity (Holsti, 1968).
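
The counting step of content analysis can be sketched in a few lines. This is only an illustration under stated assumptions: the coding scheme `CATEGORIES`, the function `category_frequencies`, and the sample utterances are all hypothetical, and keyword matching stands in for the judgments that trained human coders would make in an actual study.

```python
from collections import Counter

# Hypothetical coding scheme (keyword -> category); not taken from any real study.
CATEGORIES = {
    "left": "spatial_relation",
    "right": "spatial_relation",
    "so": "inference",
    "therefore": "inference",
    "wait": "revision",
}

def category_frequencies(utterances):
    """Count how often each category occurs across a list of utterances."""
    counts = Counter()
    for utterance in utterances:
        for word in utterance.lower().split():
            word = word.strip(".,;:!?")  # drop surrounding punctuation
            if word in CATEGORIES:
                counts[CATEGORIES[word]] += 1
    return counts

# Invented sample protocol, for illustration only.
protocol = [
    "Jonas is to the right of Ingvar.",
    "So Ingvar is left of Jonas.",
    "Wait, Olof is further left.",
]
freq = category_frequencies(protocol)
```

Reversing the order of the utterances leaves `freq` unchanged: the resulting frequencies carry no information about the order in which the utterances occurred.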

Newell (1966) and Newell and Simon (1972) have proposed a new method for the analysis of 
protocols. They did not name their method; for convenience, I will refer to it as trace analysis. In the 
terms introduced above, trace analysis is an intensive, open method which aims for singleton
descriptions. The type of behavioral record to which trace analysis applies is a think-aloud protocol. The 
type of description produced is a specification of an information processing system that behaves like the
observed person. The explanatory procedures that generate an information processing system from a 
think-aloud protocol are rather complicated; they will be presented below in the context of an example. 
Trace analysis breaks new ground in that it combines an interest in the meaning of protocol fragments 
(which is characteristic of the method of excerpts) with a concern for imposing a discipline on the process
of analysis (which is characteristic of content analysis). Also, it makes use of the sequential information in
a protocol, a type of information which is destroyed by methods that build on category frequency.

Trace analysis has been all but ignored. Today, sixteen years after its introduction, there exists, to 
the best of my knowledge, no published research report that uses it, other than the book in which it was
originally introduced. One possible explanation for this fact is that the description of the method is 
somewhat obscure, and, moreover, buried in a single chapter of a large and rather difficult book (Newell &
Simon, 1972, Chap. 6). Another possible explanation is that Newell and Simon introduced trace analysis
in the context of a specific application, namely a study of so-called cryptarithmetic problems. 4 Since
human performance on cryptarithmetic problems is not a hot substantive topic, researchers might bypass
Newell and Simon's study as not relevant to their interests, thus missing the methodological contribution
of that study. Also, researchers might fail to distinguish between different types of protocol analysis. 
Researchers who use either the method of excerpts or the method of content analysis may believe that
they are using the method proposed by Newell and Simon, and consequently feel no need to study their
original description of trace analysis. Yet another possible explanation is that trace analysis breaks so
radically with the methodological traditions of academic psychology that it simply has not been 
understood. 

The purpose of this chapter is to develop the implications of trace analysis for standardized testing, 
and to facilitate and promote wider discussion and use of trace analysis in both research and practical 
contexts. The introduction to trace analysis presented here is, I believe, more accessible than the original
presentation by the inventors of the method. Also, the task domain chosen for the application, verbally
presented spatial reasoning problems, is different enough from cryptarithmetic to provide some evidence
for the generality of the method.

The chapter is organized as follows. Section 2 puts forth the rationale of trace analysis. Section 3 is 
devoted to an application of trace analysis to spatial reasoning. Section 4 contains a speculative
proposal for a non-psychometric methodology of standardized testing that builds on trace analysis. 

2. The Enaction Theory and Trace Analysis 

Allen Newell and Herbert A. Simon have proposed that we think by mentally enacting alternative 
sequences of actions with respect to a problem (Newell, 1966, 1980, 1987; Newell, Shaw, & Simon, 1958; 
Newell & Simon, 1972). Although they did not name their theory, I have called it the Enaction Theory in 
other contexts (Ohlsson, 1983) and I will continue to do so here. The main methodological implications of 
the Enaction Theory are that cognitive diagnosis should be based on a sequentially ordered and 
temporally dense trace of the performance to be diagnosed, and that a diagnostic description should take 
the form of a specification of an information processing mechanism that can reproduce the observed 
performance. Think-aloud protocols fulfill the methodological requirements better than other types of
behavioral records. Trace analysis is primarily a method for the analysis of think-aloud protocols. Both
the Enaction Theory and the method of trace analysis are described below. 

2.1. The Enaction Theory of Thinking

The Enaction Theory asserts that cognitive processing takes the form of heuristic search through a 
problem space. The process of heuristic search consists in using a strategy, i. e., a collection of problem
solving heuristics, in order to decide which operator, i. e., cognitive skill, should be applied to the
current knowledge state, i. e., mental representation of a problem. The application of an operator 
generates a new knowledge-state. The successive application of operators continues until a knowledge- 
state is reached in which the problem solver's goal is satisfied. These concepts may need some 
clarification. 

Consider a person confronted with an intellectual task, such as the Tower of Hanoi puzzle, a chess
problem, an algebra problem, Maier's Two-String Problem, or a geometry proof problem. In order to
solve the task he must construct a mental representation of the given information, the
problem-as-presented. The internal description of the problem is called the initial knowledge state. For instance, in
the Tower of Hanoi puzzle 5 the problem-as-presented can be seen as a pyramid of discs; in a verbal
reasoning task the givens might be conceptualized as a list of related facts. The problem solver must
also build a mental representation of what he is supposed to do with the task, i. e., of what counts as
having solved it. This representation is his goal. The goal specifies when to terminate the problem 
solving effort. For instance, in the Tower of Hanoi puzzle the goal might be conceptualized as transport 
the pyramid of discs to another peg. The initial knowledge state and the goal together constitute an 
understanding of the problem. 

5 Given three pegs and a number of discs of different sizes, stacked on one of the pegs in order of increasing size, move the discs to another
peg by moving one disc at a time, without ever putting a larger disc on a smaller (Simon, 1975).

Once the task has been understood, the thinker must call up a repertory of mental actions or cognitive
skills with which he can process the problem. They are called operators, because they operate upon the
current mental representation of the problem to generate a new representation (namely a representation
of what the problem situation would be like if the physical action corresponding to the operator were to be
carried out). The application of operators is a mental, rather than a behavioral, process. The theory
asserts that the thinker is acting out in his mind what would happen if such and such an action were to be
taken with respect to the problem. For instance, in solving a chess problem the thinker is likely to imagine
what would happen, if he were to make such and such a move; in an algebraic proof problem, the thinker
might anticipate what a particular formula would look like, if a certain transformation were applied to it.
The theory claims that the problem solver at any one time considers only a small ensemble of operators
that he has judged as relevant for his current problem. The problem solver may or may not be correct in
his relevance judgments, so the operator ensemble may or may not include all operators necessary to
solve the problem. 6

The initial knowledge state and the repertory of relevant operators (or the operators the problem 
solver believes are relevant) implicitly specify a space of solution candidates to the problem, known as the 
problem space. 7 A solution consists in the application of some operator to the initial state, then another 
(not necessarily distinct) operator to the resulting state, then yet another operator to its result, etc., until 
the goal has been reached. A solution candidate consists in a sequence of operator applications, known 
as a path through the problem space. For instance, pick up the hammer, tie the hammer to one of the 
ropes, set the rope swinging, walk over to the other rope, grab the first rope as it comes swinging, untie 
the hammer, and tie the ropes together is a sequence of steps which constitutes a solution to Maier's
Two-String Problem. 8 The initial state and the repertory of operators together generatively define the set 
of all possible solution candidates. The Enaction Theory asserts that thinking consists in the mental 
exploration of this set. 
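
The notions of knowledge state, operator, and path can be made concrete with the Tower of Hanoi puzzle mentioned earlier. The following is a minimal sketch, not a model of human thinking: the state encoding (a tuple of pegs, each listing its discs from bottom to top) and the function names `legal_moves` and `find_path` are assumptions made for illustration, and the search is exhaustive breadth-first search rather than the selective search the Enaction Theory attributes to people.

```python
from collections import deque

def legal_moves(state):
    """Yield (operator, new_state) pairs.

    An operator (src, dst) moves the top disc of peg src onto peg dst;
    it is legal only if dst is empty or its top disc is larger.
    """
    for src in range(3):
        if not state[src]:
            continue
        disc = state[src][-1]
        for dst in range(3):
            if dst != src and (not state[dst] or state[dst][-1] > disc):
                pegs = [list(p) for p in state]
                pegs[src].pop()
                pegs[dst].append(disc)
                yield (src, dst), tuple(tuple(p) for p in pegs)

def find_path(initial, is_goal):
    """Breadth-first search: return a shortest operator sequence to a goal state."""
    frontier = deque([(initial, [])])
    seen = {initial}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for op, nxt in legal_moves(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, path + [op]))
    return None

initial = ((3, 2, 1), (), ())            # initial knowledge state: all discs on peg 0
path = find_path(initial, lambda s: s[2] == (3, 2, 1))  # goal: pyramid on peg 2
```

Here the initial state and `legal_moves` together define the problem space generatively, and `path` is one path through it that ends in a goal state.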

In routine action the sequence of operators that lead to the goal is known beforehand. For instance, 
in solving a multi-column addition task, any competent adult knows to begin with the column to the right, 
add within a column, carry to the next column to the left, etc. Such a task is not properly called a 
problem. A task is a problem when the solution path is not known beforehand, but has to be found by 
trying out various operator sequences, judging how promising they are, and selecting one for execution.
If the selected action sequence does not, in fact, lead towards the goal, the problem solver has to go back
and try a different sequence, a process that naturally enough is called back-up. The process of exploring
alternative paths is called search. The search is anticipatory; we search in the head before we search in
the flesh, as it were, a decision making technique that has considerable survival value. 

6 This principle has been used to explain the phenomena of restructuring and insight in problem solving (Ohlsson, 1984, c).

7 The terminology chosen by Newell and Simon is unfortunate on this point. "Solution space" would have been more descriptive
than "problem space". Grave misunderstanding of the theory results if a problem space is construed as a space of problems
instead of as a space of solution candidates for a particular problem.

8 Two ropes are suspended from the ceiling; the distance between them is too wide to allow a person to reach one rope while
holding the other. A variety of everyday objects is provided. The task is to tie together the two ropes (Maier, 1970).

A problem space can be searched systematically, by exploring all possible paths. But simple
combinatorial calculations will show that the number of possible operator sequences is astronomical,
even if the repertory of actions is small and the length of the solution path short. For instance, if there are
5 relevant operators and if the solution path is 10 steps long, then there are 5 to the 10th power, or
approximately ten million, different solution candidates. Systematic search is not feasible. Instead, the
Enaction Theory claims, problem solvers search selectively, applying rules of thumb called heuristics.
Such a rule contains information about which operator is most likely to lead towards the goal in some
particular type of situation. For instance, a useful heuristic for geometry proof problems is if the task is to
prove two geometric objects congruent, and if the given figure contains many straight lines, try to find
congruent triangles. A problem solving strategy consists of a collection of such rules. The efficiency of
problem solving is a function of how accurately the available heuristics sort out blind alleys and focus the
search on a path that leads to the goal. The Enaction Theory explains expert performance in knowledge-
rich domains (Newell & Simon, 1972, Chap. 11-13) as a product of a large number of very selective
heuristics.
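
The combinatorial estimate above is easy to verify. The sketch below computes the number of operator sequences for a given repertory size and path length; the function name `count_candidates` is hypothetical, and the second call, in which heuristics are assumed to leave only 2 plausible operators per state, is an invented figure used purely to illustrate the effect of selectivity.

```python
def count_candidates(n_operators, path_length):
    """Number of distinct operator sequences of the given length."""
    return n_operators ** path_length

# The case discussed in the text: 5 operators, 10 steps.
exhaustive = count_candidates(5, 10)

# Illustrative assumption: heuristics narrow the choice to 2 operators per state.
selective = count_candidates(2, 10)

print(exhaustive, selective)
```

Under these assumptions the exhaustive space holds 9,765,625 candidates, while the selectively searched space holds 1,024.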

The Enaction Theory is a successful theory. The notion of heuristic search through a problem space 
has now been articulated with respect to a wide range of human behaviors, from syllogistic reasoning 
(Newell, 1980) to the configuration of computers (Rosenbloom, Laird, McDermott, Newell, & Orciuch,
1985). The theory explains why some problems are more difficult than others (see, e. g., Kotovsky, Hayes,
& Simon, 1985). It explains individual differences in thinking (see, e. g., Newell & Simon, 1972, Chaps. 7,
10, and 13). During recent years the Enaction Theory has been the basis for several theories of learning
(see the collections of articles edited by Anderson, 1981; by Bolc, 1987; and by Klahr, Langley, &
Neches, 1987a). The Enaction Theory carries definite implications for education (Frederiksen, 1984; 
Ohlsson, 1983; in press); indeed, it is solid enough to support the design of intelligent tutoring systems 
(Anderson, Boyle, & Reiser, 1985). There is at the current time no other theory of human thinking with 
comparable scope, precision, empirical grounding, and practical utility. 

2.2. The Method of Trace Analysis 

If the Enaction Theory of thinking is correct, what kind of empirical method do we need in order to 
explain particular problem solving performances? The theory implies that a psychological explanation 
consists of three parts: An hypothesis about the subject's problem space (his understanding of the 
problem, and the mental resources he has available for processing it), an hypothesis about his solution 
path (the sequence of mental states he traversed on his way to the goal), and an hypothesis about his 
strategy (the collection of heuristics that generated the solution path). The empirical observations we
collect and the procedures by which we analyze them must enable us to identify those three constructs. 

Newell and Simon (1972) proposed that think-aloud protocols are an ideal type of behavioral record for
the study of problem solving, and they invented trace analysis 9 as a method for the processing of such
protocols. The main methodological works on trace analysis are Newell and Simon (1972, Chapter 6)
and Ericsson and Simon (1984). Trace analysis proceeds in a bottom-up fashion through three main
steps: 

1. Construct the subject's problem space: (a) infer his mental representation of the task from 
the words he uses to describe the problem; (b) infer his ensemble of operators from 
recurring patterns of activity that give rise to new conclusions; and (c) infer his goal by 
noticing when, under what conditions, he declares himself finished with the task. 

2. Identify the subject's solution path by making use of the sequential information in the 
protocol in order to map it onto the problem space identified in step 1. This amounts to 
choosing a path through the problem space which explains as many of the events in the 
protocol as possible. 

3. Hypothesize the subject's strategy by inventing problem solving heuristics that can 
reproduce his solution path. The strategy hypothesis is complete if for each state-step pair
along the solution path, there is some heuristic in the strategy that can generate that step
when applied in that state. 

The description of the subject achieved with this method consists of a problem space and a strategy for 
how to search that space. The description of his performance consists of a solution path. 
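
The completeness criterion in step 3 can be stated as a short check. The sketch below is a hedged illustration, not the procedure of Newell and Simon: the representation of a solution path as a list of (state, step) pairs, the representation of a heuristic as a function from a state to a proposed step (or `None`), and all names and toy data are assumptions.

```python
def uncovered_steps(solution_path, heuristics):
    """Return the (state, step) pairs that no heuristic in the strategy reproduces."""
    return [
        (state, step)
        for state, step in solution_path
        if not any(h(state) == step for h in heuristics)
    ]

# Toy state: (number of unread premises, whether the question can be answered).
def read_if_unread(state):
    unread, _ = state
    return "read_premise" if unread > 0 else None

def answer_when_ready(state):
    _, ready = state
    return "answer" if ready else None

# Invented solution path, for illustration only.
solution_path = [
    ((3, False), "read_premise"),
    ((2, False), "read_premise"),
    ((1, False), "read_premise"),
    ((0, True), "answer"),
]

complete = uncovered_steps(solution_path, [read_if_unread, answer_when_ready])
incomplete = uncovered_steps(solution_path, [read_if_unread])
```

An empty result means the strategy hypothesis is complete in the sense defined above; any pairs returned mark steps that still need a heuristic.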

The three steps described above build on each other: Identification of the problem space enables the
description of the solution path, and a description of the solution path enables identification of the
heuristics. Only the first two steps build directly on the information in the data. The step of identifying the
problem space makes use of the content of the protocol utterances, while the step of laying out the
solution path builds on the sequential information in the protocol. The third step, however, builds on the
previous two steps. The problem solving heuristics used by the subject are inferred from the solution
path, not from the protocol. In summary, the problem space constitutes a special-purpose formalism for
describing the solution path; the solution path is a low level mini-theory which explains the behavioral
record; the strategy is a slightly higher-level mini-theory that explains the solution path. 10

9 The name "trace analysis" is preferred over "protocol analysis", since I do not want to imply that the method invented by Newell
and Simon is the only possible method for the analysis of protocols.

The Enaction Theory implies two methodological requirements that are difficult to fulfill with any other 
type of behavioral records than think-aloud protocols. The first requirement is that the behavioral record 
must enable us to infer the subject's conceptualization of the problem. We therefore need to hear him 
talk about the problem. How does he parse the problem situation into distinct objects, what properties 
does he assign to them, and what relations does he see between them? What representational formats 
does he use to encode those properties and relations? For instance, in so-called cryptarithmetic
problems, the concept of parity (whether a number is odd or even) is often crucial to successful problem
solving (Newell & Simon, 1972). It is obviously difficult to know whether a person is using the concept of
parity or not, unless we hear him talk about the problem. As a second example, Johnson-Laird (1983)
has argued that people solve verbal reasoning problems with mental models, rather than with
propositional representations. It is obviously difficult to know what representational format a person is
using, unless we hear him verbalize it.

The second methodological requirement of the Enaction Theory is that the behavioral record must 
enable us to infer the sequence of mental events that took place when the subject solved the 
experimental problem. Unless we know the solution path, we cannot infer the strategy. Different paths
might lead to the same end-state, so a recording of the end-state or the time it took the subject to arrive at 
the end-state does not enable us to identify his path. We need to observe the intermediate stages of the 
problem solving effort, the sequence of partial results created along the path to solution. The trace of the 
partial results should preferably be temporally dense, i. e., have many observations of the performance 
per unit of time, in order to accurately discriminate the subject's path from alternative paths through the 
problem space. 

Think-aloud protocols fulfill both of the above requirements. They reveal how subjects conceptualize
the experimental problem, and they provide a sequentially ordered and temporally dense trace. Other
types of behavioral records are less satisfactory. Interviews destroy sequential information, because the
order of the subject's utterances is partially controlled by the order of the interviewer's questions; in
retrospective interviews the sequential information is further corrupted by memory failures. In general,
interviews reveal the subject's representation, but do not enable us to infer his solution path. Video
tapes of behavior or the recording of key strokes on computer terminals provide sequential information,
but they do not give us any insights into the subject's mental representation. In general, behavioral
recordings reveal the path, but not the representation. Eye movement recordings may reveal the
representation (since they tell us what features of the problem situation the subject attends to, or can
discriminate between), but since they do not reveal what the subject does with the problem information,
they do not enable us to infer the solution path. In short, think-aloud protocols fulfill the methodological
requirements of the Enaction Theory better than other types of behavioral records.

10 The hierarchy of explanations does not end with the strategy, of course. The strategy is explained by a learning theory, which,
in turn, is explained by the structure of the cognitive architecture; the latter is related to the structure of the brain; and so on.

In summary, human beings are hypothesized to think by mentally exploring alternative paths through 
some search space. The methodological implication of this hypothesis is that cognitive diagnosis should
be based on a sequentially ordered and temporally dense behavioral record that is analyzed with the goal 
of designing an information processing mechanism that can reproduce the observed behavior. A 
concrete example of this kind of cognitive diagnosis is worked out in detail in the next section. The 
implications of this methodology for the construction of standardized tests are developed in the fourth and 
final section. 

3. Trace Analysis Applied to Spatial Reasoning 

Consider the spatial reasoning problems in Figure 3-1. Each problem consists of a short text 
describing a static situation by asserting certain spatial relations between some discrete, stable objects. It
ends with a question concerning a relation not explicitly mentioned in the text. I call problems of this sort
spatial arrangement problems. The relational concepts used are common sense spatial concepts. 11
They include unary predicates like "bottommost", ternary predicates like "between", and ambiguous
predicates like "adjacent". If the number of objects in such a problem is larger than three, it will usually
take an adult more than a minute to solve that problem; if the number of objects is, say, ten, and if the 
relational structure embedded in the premises is complex, the solution time can be as long as 20 minutes. 

From a problem solving point of view, spatial arrangement problems are unusual in that they are 
static. Many problems used to study problem solving require a sequence of transformations of the given 
situation. In a spatial arrangement problem, on the other hand, the task is not to transform the given
situation, but to understand it well enough to answer a question. From a psychometric point of view,
spatial arrangement problems would be expected to have high loadings on spatial ability, reasoning ability,
and verbal ability. A main difference between spatial arrangement problems and typical test items is that
spatial arrangement problems take more time to solve.

11 The problem texts are translated from Swedish. Phrases like "bottommost but one" and "frontmost" may not be good English,
but their Swedish counterparts are quite idiomatic.

1. The Bench Problem

Some boys are sitting on a bench.

Jonas is further right than Ingvar.
Olof is further left than Ingvar.
David is immediately to the left of Jonas.

Who is immediately to the right of Ingvar?


2. The Block Problem

A child is putting blocks of different colors on top of each
other.

A black block is between a red and a green block.
A yellow block is further up than the red one.
A green block is bottommost but one.
A blue block is immediately below the yellow one.
A white block is further down than the black one.

Which block is immediately below the blue one?


3. The Icecream Problem

Some boys are standing in line at an ice-cream stand.

Rolf is further towards the front than Erik.
Sven is further towards the front than Ove.
Nils is immediately behind Mats.
Hans is frontmost but one.
Mats is further back than Ove.
Erik is immediately behind Hans.
Leif is further back than Mats.

Who is immediately behind Erik?


Figure 3-1: Examples of spatial arrangement problems.
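
As a check on the task itself, the Block Problem in Figure 3-1 can be solved mechanically by enumerating all orderings of the six blocks and keeping those that satisfy the five premises. The following Python sketch is not part of the original study and is not a model of how subjects reason; the function names are hypothetical, and "between" is read in its loose sense (not necessarily adjacent).

```python
from itertools import permutations

BLOCKS = ("black", "red", "green", "yellow", "blue", "white")


def satisfies_premises(pos):
    """pos maps each block to its height index; 0 is the bottommost position."""
    low, high = sorted((pos["red"], pos["green"]))
    return (
        low < pos["black"] < high              # black is between red and green
        and pos["yellow"] > pos["red"]         # yellow is further up than red
        and pos["green"] == 1                  # green is bottommost but one
        and pos["blue"] == pos["yellow"] - 1   # blue is immediately below yellow
        and pos["white"] < pos["black"]        # white is further down than black
    )


def solve_block_problem():
    """Return every bottom-to-top ordering consistent with the five premises."""
    solutions = []
    for order in permutations(BLOCKS):
        pos = {block: height for height, block in enumerate(order)}
        if satisfies_premises(pos):
            solutions.append(order)
    return solutions


def block_below(order, block):
    """Return the block immediately below `block`, or None for the bottommost."""
    height = order.index(block)
    return order[height - 1] if height > 0 else None


if __name__ == "__main__":
    for order in solve_block_problem():
        print(" ".join(order), "->", block_below(order, "blue"))
```

Under these assumptions exactly one ordering survives (white, green, black, red, blue, yellow, from bottom to top), so the premises determine the answer "red" uniquely.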

Empirical studies of spatial arrangement problems, using both trace analysis and experimental 
methods, have revealed a number of phenomena: 

• A majority of adults solve spatial arrangement problems with the help of a mental model, 12
rather than by reasoning exclusively in a propositional mode (Hagert, 1980a, 1980b;
Johnson-Laird, 1983; Ohlsson, 1980a, 1984b).

• A small minority of adults use a propositional reasoning method based on the idea of
elimination of alternatives (Ohlsson, 1980a, 1984b).

• An even smaller minority try to apply other, less rational approaches to the problem, such as
trying to infer the quantitative distances between the objects (Ohlsson, 1980a).

• The particular problem spaces used to implement the mental model building strategy vary
from one individual to the next, as do the heuristics used to search them, with substantial
differences in the solution paths traversed by different persons as a consequence (Ohlsson,
1980a, 1980b, 1982).

• Some subjects shift back and forth between model-building and propositional strategies.
Subjects can be induced to make such strategy shifts, even when they do not show any
spontaneous tendency to do so (Ohlsson, 1984a).

• Strategies for spatial arrangement problems have a large attention allocation component.
The solution to a spatial arrangement problem depends crucially upon which premises are
read in which order. Consequently, differences in attentional heuristics are a major source of
individual differences in this task domain (Ohlsson, 1984b).

• The spatial competence needed to solve spatial arrangement problems is large. A list of the
inferences needed to build mental models of linear orderings from propositional descriptions
contains over one hundred distinct inference patterns (Ohlsson, 1980a).

• Backups are frequent events in problem solving efforts in this domain. However, a large
proportion of backups are not followed by the exploration of new search paths, but by the
re-traversal of the already explored search path (Hagert & Rollenhagen, 1981; Ohlsson,
1980a). Hence, they are not backups in the search theory sense. These backups occur, I
believe, because working memory capacity limitations make it necessary to recreate
intermediate results from time to time. I will call backups which are followed by repetition of
previously performed inferences consolidation backups.

12 The term mental model is here used in the sense of Johnson-Laird (1983), who defines a model as an object which satisfies a
[illegible], as the term is used in formal logic. The term is commonly used within [illegible] to refer to any integrated
knowledge unit with [illegible], particularly if it encodes knowledge about a
physical mechanism or process. For examples of this alternative use of the term, see the collection of articles by Gentner and [...]

The general conclusions summarized above are based on large numbers of applications of trace 
analysis. For example, Study II of Ohlsson (1980a) was based on fifty protocols, each of which was 
analyzed with the help of trace analysis. A diagnosis of a single performance is presented in detail
below.

3.1. The Subject and the Behavioral Record 

The performance to be diagnosed here was selected from a larger study (Ohlsson, 1980a, Study I). 
Twelve subjects participated in the study. They solved a variety of spatial arrangement problems under 
different conditions. The protocol to be discussed here was produced by a subject labeled SI6 while 
solving the Block Problem (see Figure 3-1). It was selected for analysis on the basis of completeness 
and interest. 

Subject SI6 was a 30-year-old psychology student. She participated in the experiment as part of a
course requirement. She was not paid. The Block Problem was her third problem in the experimental 
session. In a previous session she had solved three simpler spatial arrangement problems. 

The problem text was typed as it appears in Figure 3-1 on a white index card which was handed over 
to the subject at the beginning of the solution attempt. She had the card available throughout the solution 
attempt. She was not allowed the use of paper and pencil or any other tool. She was instructed to think 
aloud. The exact instruction given was "give words to your thoughts as you have them". She was 
instructed to begin her solution attempt by reading through the problem text aloud. The verbalizations 
were tape recorded and transcribed verbatim.

The complete protocol 13 is shown in Figures 3-2 and 3-3. F-numbers in the following analysis refer to
protocol fragments in those figures. The protocol is 3:40 minutes long (220 seconds), including the initial
reading of the problem text. It contains a total of 314 words, which means that the subject's speech rate
was approximately 1.4 words per second. There are no task-irrelevant passages in the protocol, nor any
interactions with the experimenter. The solution attempt ended when the subject gave her answer, which
was correct.

F1. a child puts blocks in different colors on top of each other
F2. a black block is between a red and a green block
F3. a yellow block is further up than the red one
F4. a green block is bottommost but one
F5. a blue block is immediately below the yellow one
F6. a white block is further down than the black one
F7. what block is immediately below the blue one
F8. the black block is between a red and a green
F9. block
F10. well that does not mean that it must be exactly between
F11. could be something else between also
F12. a yellow block is further up than the red one
F13. a green block is bottommost but one
F14. a blue block is immediately below the yellow one
F15. the yellow one is higher up than the red one
F16. and immediately below the yellow one comes the blue one
F17. then comes a red one
F18. I'd say
F19. well
F20. a
F21. a yellow block is higher up than the red one
F22. a green block is bottommost but one
F23. a blue block is immediately below the yellow one
F24. below the yellow one is a blue block
F25. and a yellow block is higher up than the red
F26. below the yellow is then also a red
F27. a blue and a red are below the yellow one
F28. and a
F29. a blue and a red are under the yellow block
F30. and a green block is bottommost but one
F31. a black block is between the red and the green
F32. a black block
F33. a black block is between the red and the green block
F34. a white block is further down than the black one
F35. then there is a white
F36. and then we have a
F37. oh how difficult
F38. a white block is further down than the black one
F39. and the black one is between the red and the green

Figure 3-2: Think-aloud protocol from subject SI6 on the Block Problem, Part 1.

F40. white red black green
F41. I'll say then
F42. but the green is bottommost but one
F43. then I'll say
F44. white green black red
F45. instead
F46. then the white one is bottommost
F47. white green black and red
F48. and then we had the
F49. blue one which is immediately below the yellow
F50. the yellow is higher up than the red
F51. then it is topmost so far
F52. the yellow one
F53. and the blue one is immediately below
F54. then it comes topmost but one
F55. which one is then immediately below the blue one
F56. immediately below the blue one is then the red one

Figure 3-3: Think-aloud protocol from subject SI6 on the Block Problem, Part 2.

3.2. Diagnosing the Subject's Problem Space 

The problem space used by the subject is discussed in four subsections, dealing with her 
representation, her operations, her goal, and her memory resources, respectively. 

Representation 

The protocol shows that, as one would expect, the subject is capable of reading and
comprehending the sentences in the problem text, and of making use of the propositional information
conveyed by them. However, there are several classes of propositional constructions which are not used
by this subject in this protocol. First, there are no examples of negated sentences in the protocol. SI6
does not use expressions of the form "Object X is not ...", e.g., "The black block cannot be above ...".
Second, there is no evidence for the use of quantifiers. She does not use expressions of the form "All
objects are ..." or "At least one object is ...". Third, she does not use any sentential connectives (even
though she uses "and" to connect arguments within propositions). In particular, she does not use any
if-then constructs, such as "consequently", "therefore", "it follows that", etc. In summary, simple
predicate-argument constructions are sufficient to capture the subject's representation of propositional
information about the task.

<knowledge-state>   ::= <knowledge-element> /
                        <knowledge-element> <knowledge-state>
<knowledge-element> ::= <tag> <knowledge-element> /
                        <proposition> / <question> / <model>
<proposition>       ::= (<predicate> <object-sequence>)
<question>          ::= (<predicate> ? <object>)
<predicate>         ::= ABOVE / IMMEDIATELY-ABOVE /
                        UNDER / IMMEDIATELY-UNDER /
                        TOPMOST / TOPMOST-BUT-ONE /
                        BOTTOMMOST / BOTTOMMOST-BUT-ONE /
                        ADJACENT / BETWEEN / ANSWER
<model>             ::= (<end-anchor>.1 <element-sequence> <end-anchor>.2)
<end-anchor>        ::= TOP / BTM
<element-sequence>  ::= <element> / <element> <element-sequence>
<element>           ::= <object> / <relation>
<object-sequence>   ::= <object> / <object> <object-sequence>
<object>            ::= red / black / white / green / yellow / blue
<relation>          ::= <followed-by> / <adjacent-to>
<followed-by>       ::= "blank space"
<adjacent-to>       ::= "colon"
<tag>               ::= old / new / unc / imp
<probe>             ::= FIRSTPREM / SECPREM / THIRDPREM /
                        FOURTHPREM / FIFTHPREM / NEXTPREM / QUESTION
<operator>          ::= READ / TRNS / INT / ANSW

Figure 3-4: Mental representation of subject SI6 for the Block Problem.

There is evidence in this protocol (as well as in other protocols from this subject) that the propositional
format is not the only one used by SI6. In three places (F40, F44, and F47) she verbalizes her
knowledge of the problem situation through a list of object-names, e.g.:

F40. white red black green 

I take this as evidence that she is building a mental model of the problem situation, trying to see in her
mind's eye the six blocks standing on top of each other.

A mental model of a linear ordering can be represented as a list of object symbols. Two refinements
are needed to accurately represent this subject, namely end-anchors and a distinction between "adjacent"
and "followed-by". First, the subject reads out her mental model in different directions at different times
during the solution attempt (from top to bottom in F15-F17, and from bottom to top in, e.g., F40). This
implies that her representation contains some device which allows her to keep track of the direction of a
model. I will assume that she does this with the help of end-anchors, i.e., symbols which label the top
and the bottom of the ordering respectively. In the formal model these are represented by the arbitrary
symbols TOP and BTM, respectively.

Second, the subject is able to infer from premise 2 ("A yellow block is further up than the red one")
and premise 4 ("A blue block is immediately below the yellow one") that the red block is below the blue
block (see fragments F15-F17). This conclusion does not follow unless a distinction is made between two
different relations, namely "x is adjacent to y", which implies that there is no object between x and y, and
"x is followed by y", which does not say anything about proximity. Hence, the subject's mental model
must contain some device for distinguishing between these two relations. In the formal model "adjacent
to" is symbolized by a hyphen, and "followed by" by a blank space. For instance, (TOP x-y BTM) means
that y is below and adjacent to x, while (TOP x y-BTM) means that y is somewhere below x, that there could be
other objects between x and y, and that there are no objects below y.
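
The two refinements can be made concrete in a small data structure. The Python sketch below is an illustration only, not the representation used in the original simulation; the class and method names are hypothetical. It stores the objects from TOP to BTM together with one flag per link, uses the hyphen and blank-space notation of the text, and does not model adjacency to the end-anchors themselves.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MentalModel:
    """A linear ordering, listed from the TOP anchor down to the BTM anchor."""

    objects: Tuple[str, ...]
    # adjacent[i] is True when objects[i] and objects[i + 1] are known to touch
    # ("x-y"); False means "followed by", so other objects may still go between.
    adjacent: Tuple[bool, ...]

    def __post_init__(self):
        if not self.objects or len(self.adjacent) != len(self.objects) - 1:
            raise ValueError("need one link between each pair of neighbours")

    def notation(self) -> str:
        """Render the model in the (TOP ... BTM) notation of the text."""
        text = self.objects[0]
        for is_adjacent, obj in zip(self.adjacent, self.objects[1:]):
            text += ("-" if is_adjacent else " ") + obj
        return f"(TOP {text} BTM)"

    def read_out(self, start: str = "TOP") -> list:
        """List the objects starting from the named end-anchor."""
        if start == "TOP":
            return list(self.objects)
        if start == "BTM":
            return list(reversed(self.objects))
        raise ValueError("start must be 'TOP' or 'BTM'")

    def room_between(self, upper: str, lower: str) -> bool:
        """True if some link between upper and lower is only 'followed by'."""
        i, j = self.objects.index(upper), self.objects.index(lower)
        if i >= j:
            raise ValueError(f"{upper} is not above {lower} in this model")
        return not all(self.adjacent[i:j])
```

For example, `MentalModel(("yellow", "blue", "red"), (True, False))` renders as `(TOP yellow-blue red BTM)`; reading it out from BTM gives red, blue, yellow, and `room_between("blue", "red")` is true because that link is only "followed by".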

It will be necessary to assume that the various kinds of knowledge elements used to represent the
problem have different modes. These modes will be symbolized in the analysis with the help of indices or
tags. I will assume that the subject can tag knowledge elements in four different ways:

new a new result (i.e., an output from an operator);

old information which has already been used as basis for an inference;

unc a result which is experienced by the subject as unclear;

imp a result which is impossible because it contradicts the given information.

The evidence for the "new" and "old" tags is indirect. It consists in the global observations that SI6
always works on newly produced information, and that old information never confuses her or interferes
with her processing. The evidence for the "unclear" status is more direct: in fragment F18 (see Figure
3-2) the subject directly verbalizes uncertainty about an outcome. The evidence for the "imp" tag, finally,
is also direct: in the course of solving the problem the subject discovers a contradiction which leads her
to revise her model; the fragments F42-F45 show that she is aware of this contradiction.

There are some types of information which are not used by SI6 in her solution to the Block Problem.
First, she does not think about the absolute positions of the objects, in contrast to the relative positions
the objects acquire in a partially completed model. For example, she does not ask herself questions like
"What object goes into the topmost position?" or "What position should be assigned to object so-and-so?".
Her representation is relative and topological in character, rather than absolute and positional.

A second and related point is that SI6 makes no use of numerical information. There is no evidence
that she thinks in terms of number of objects: how many objects there are all in all, how many objects she
has left to place, how many objects there could be room for in such-and-such a part of the model, etc.
Indeed, there is no evidence that she ever counts the total number of objects mentioned in the problem.
(This raises the question of how she knows that she has completed her mental model.)

Third, there are no verbalizations of goals, plans, or intentions. She never says anything about what
she is trying to do, or what she would like to be able to do, e.g., "Next, I should find out the position of
object X" or "I now want to find the object that is adjacent to object X". 14

The representational format used by this subject on this type of task is summarized by a generative
grammar in BNF form 15 in Figure 3-4.

Operators

The subject shows evidence of using four basic problem solving operators (mental processes which
produce new results): reading the problem text (READ), translating propositional information into a
mental model (TRNS), extending an existing mental model by integrating further propositional information
into it (INT), and answering a question by reading off the answer from a mental model (ANSW). They are
defined in Figure 3-5.

It is worth emphasizing that the READ process is included among the problem solving operators. In 
this analysis reading new information from the display counts as a step forward in the problem space. 
This implies that a model of the subject's strategy must include assumptions about when and how she 
attends to the problem text. Heuristics for how to access the problem text play an important part in
understanding human performance in this task domain.

The subject's world knowledge, or spatial competence, enters into the processing mainly through the 
TRNS, INT, and ANSW operators. They generate new conclusions. In order to model the subject's 
performance we need to know which spatial inferences these operators are capable of, i. e., what 
inferential competence we should stock them with, as it were, in order to accurately simulate human 
behavior. Task analysis indicates that there are approximately one hundred distinct inferences about
linear orderings which adults in our culture would consider valid (Ohlsson, 1980a). The analysis of the 
inferential competence of this subject will not be pursued further here. 

Goal 

The goal of solving a spatial arrangement problem is to answer the question at the end of the 
problem text. It is trivial to answer questions about a linear ordering, if one has access to a complete 
model of that ordering, i. e., a model which includes all the objects mentioned in the problem text. I will 
assume that the operative goal of SI6 was to achieve a complete mental model. The evidence for this is
that as long as her model is incomplete, she does not read the question she is supposed to answer. 
However, as soon as her model is complete in the sense of containing all the objects, she attends to the 
question and answers it. 

How did the subject decide when her mental model was complete? Logically speaking, there are only
two possibilities: to check that each object mentioned in the problem text is included in the model, or,
alternatively, to count the objects in the model, count the objects mentioned in the text, and verify that the
counts are the same. SI6 does not show evidence of carrying out either process. The protocol contains
no clue as to how she knows that her mental model is complete. Recognition of a complete model does
not seem to be an explicit inferential step. I will assume that she infers that her model is complete when
she fails to find any missing objects, i.e., any objects not yet included in the model. There is direct
evidence for a process which tries to locate missing objects (see below).

READ(<probe>) Read from the problem text that item which is specified by the probe. This operator
accesses the external display, and delivers a proposition into working memory. The
proposition is tagged as new, even if it has been read before. The probe is a
description of that which is to be read. In the formal model, the probe can take the
values FIRSTPREM, SECPREM, ..., etc., NEXTPREM, and QUESTION. These
symbols are arbitrary, but their intended interpretation should be obvious.

TRNS(<proposition>) Translate a proposition into a mental model. This operator takes a proposition as
input, and delivers into working memory a model which satisfies that proposition. The
proposition is tagged as old (given that the operator is successful), and the model as
new. For instance, if the sentence "The blue block is immediately below the yellow
one" (Premise 4 in the Block Problem) corresponds to the proposition "(Adjacent-Below
blue yellow)", then TRNS[(Adjacent-Below blue yellow)] will result in the
creation of the working memory element "(TOP yellow-blue BTM)".

INT(<proposition>) Integrate a proposition into the current model. This operator takes a proposition as
input, and tries to integrate its information into the current mental model. If it
succeeds, it produces a new, extended model which is tagged as new and placed in
working memory. The proposition is tagged as old. The previous model is deleted
from working memory. For instance, if the current model is "(TOP yellow-blue BTM)",
then INT[(Further-Below red yellow)] results in the extended model "(TOP yellow-blue
red BTM)".

ANSW(<question>) Answer question. This operator compares the question and the current mental
model, and reads off the answer, if possible. The answer is then said, and the
solution attempt ended. For instance, if the current model is "(TOP yellow-blue red
BTM)", then ANSW[(Adjacent-Below blue ?)], where "(Adjacent-Below blue ?)"
corresponds to the question "Which object is immediately below the blue block?", will
result in the answer "red".

Figure 3-5: Basic problem solving operators of subject SI6.
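
The worked examples given for TRNS, INT, and ANSW in Figure 3-5 can be reproduced with a few lines of Python. This is a minimal sketch under stated assumptions, not the original simulation program: the function names are hypothetical, only the two predicates that appear in the figure's examples are handled, questions are written with the "?" before the object as in the grammar of Figure 3-4, and ANSW reads off the next object in the model regardless of link type, which is only justified once the model is complete.

```python
def render(model):
    """Render a model as '(TOP ... BTM)'; '-' marks adjacency, ' ' followed-by."""
    objects, adjacent = model
    text = objects[0]
    for is_adjacent, obj in zip(adjacent, objects[1:]):
        text += ("-" if is_adjacent else " ") + obj
    return f"(TOP {text} BTM)"


def trns(proposition):
    """TRNS: build a two-object model that satisfies a single proposition."""
    predicate, lower, upper = proposition
    if predicate == "Adjacent-Below":
        return ([upper, lower], [True])
    if predicate == "Further-Below":
        return ([upper, lower], [False])
    raise ValueError(f"predicate not covered by this sketch: {predicate}")


def integrate(model, proposition):
    """INT: add a new object somewhere below an object already in the model."""
    objects, adjacent = model
    predicate, lower, upper = proposition
    if predicate != "Further-Below" or upper not in objects or lower in objects:
        raise ValueError("this sketch only adds a new object below a placed one")
    i = objects.index(upper)
    # Objects chained to `upper` by adjacency leave no room; skip past them.
    while i < len(adjacent) and adjacent[i]:
        i += 1
    if i != len(objects) - 1:
        raise ValueError("position of the new object is ambiguous")
    return (objects + [lower], adjacent + [False])


def answ(model, question):
    """ANSW: read off the object immediately below the one asked about."""
    objects, _ = model
    predicate, unknown, reference = question
    if predicate != "Adjacent-Below" or unknown != "?":
        raise ValueError("question form not covered by this sketch")
    i = objects.index(reference)
    return objects[i + 1] if i + 1 < len(objects) else None


if __name__ == "__main__":
    model = trns(("Adjacent-Below", "blue", "yellow"))
    print(render(model))
    model = integrate(model, ("Further-Below", "red", "yellow"))
    print(render(model))
    print(answ(model, ("Adjacent-Below", "?", "blue")))
```

The `integrate` step refuses to guess when the new object could go in more than one place; how the subject resolves such cases belongs to her inferential competence, which is not analyzed here.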

Memory resources 

A model of SI6's reasoning must make some assumptions about her working memory capacity and 
about her use of long-term memory. First, what working memory capacity should be presupposed? It 
turns out that a good account of the protocol can be constructed if we assume that this subject can 
reliably hold three knowledge elements in her head at any one time. (What counts as a knowledge 
element is defined by Figure 3-4.)
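
A capacity-limited store of tagged knowledge elements can be sketched as follows. The capacity of three and the four tags come from the analysis above; everything else is an assumption of this illustration, in particular the class name and the displacement rule (when the store is full, the oldest element is dropped), which the analysis does not specify.

```python
from collections import deque

VALID_TAGS = ("new", "old", "unc", "imp")


class WorkingMemory:
    """Bounded store of tagged knowledge elements (capacity 3 by default)."""

    def __init__(self, capacity=3):
        # Assumption: when the store is full, the oldest element is displaced.
        self._slots = deque(maxlen=capacity)

    def add(self, element, tag="new"):
        """Place an element in working memory under the given tag."""
        if tag not in VALID_TAGS:
            raise ValueError(f"unknown tag: {tag}")
        self._slots.append((tag, element))

    def retag(self, element, tag):
        """Change the tag of an element that is still held, e.g. new -> old."""
        if tag not in VALID_TAGS:
            raise ValueError(f"unknown tag: {tag}")
        for index, (_, held) in enumerate(self._slots):
            if held == element:
                self._slots[index] = (tag, held)
                return
        raise KeyError(f"element no longer in working memory: {element}")

    def contents(self):
        """Return the held (tag, element) pairs, oldest first."""
        return list(self._slots)
```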

Second, the present analysis is based on the following hypotheses about long-term memory: 

• The inferential knowledge needed to solve spatial arrangement problems is stored
(procedurally) inside the TRNS, INT, and ANSW operators.

• Partial results are stored in long-term memory. More precisely, the current knowledge state is
stored after each application of the TRNS and INT operators. Stored knowledge states can
be retrieved and re-instated as the current state. 16

• The long-term memory trace contains only the path from the initial state to the current state,
i.e., search paths over which backups are made are deleted from memory.

• Long-term storage is used for various book-keeping purposes. For example, the READ
operator is able to get the next premise from the problem text, i.e., the premise immediately
below the last premise to be read. This presupposes some memory of which premise was
last read. Similarly, the SCAN operator can continue a scanning pattern from the last point
of scanning, which presupposes some memory of where the previous scan was broken off.
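
The second and third hypotheses amount to a path-shaped memory: states are appended as they are produced, and a backup discards everything after the state returned to. A minimal Python sketch, with hypothetical names and with knowledge states reduced to plain strings, might look like this:

```python
class PathMemory:
    """Long-term trace holding only the path from the initial to the current state."""

    def __init__(self, initial_state):
        self._path = [initial_state]

    @property
    def current(self):
        """The state most recently stored or re-instated."""
        return self._path[-1]

    def store(self, state):
        """Record the knowledge state reached after a TRNS or INT step."""
        self._path.append(state)

    def backup(self, steps=1):
        """Re-instate an earlier state and delete the abandoned part of the path."""
        if not 0 < steps < len(self._path):
            raise ValueError("cannot back up past the initial state")
        del self._path[-steps:]
        return self.current

    def path(self):
        """Return a copy of the stored path, initial state first."""
        return list(self._path)


if __name__ == "__main__":
    memory = PathMemory("(empty)")
    memory.store("(BTM wh bk TOP)")
    memory.store("(BTM wh rd:bk:gr TOP)")   # later found to contradict premise 3
    memory.backup()                          # abandoned state is deleted
    memory.store("(BTM wh gr:bk:rd TOP)")
    print(memory.path())
```

Because abandoned states are deleted rather than kept, such a memory cannot by itself tell the problem solver which alternatives have already been tried.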

16 Hence, a complete [illegible] of the subject's capabilities must include an operator that prepares for backup by storing the current state
in long-term memory, and a backup operator which can retrieve a stored knowledge state. These operators are defined in Figure [...]

3.3. Diagnosing the Subject's Solution Path

Figures 3-4 and 3-5 define the subject's problem space. If the hypothesis they express is correct,
they specify generatively the entire set of paths subject SI6 could have traversed while solving the Block
Problem. The next step in the construction of an explanation of her performance is to identify which path
she actually traversed. This is done by interpreting the protocol fragments in terms of the problem space
operators and their inputs and outputs. The following three interpretative principles were applied in the
present analysis: 17

1. Verbalizations from the subject are interpreted as outputs from operators, unless this would 
complicate the over-all interpretation. 

2. Backups are assigned the shortest scope which is consistent with the evidence. 

3. Verbalizations which are identical to sentences in the problem text are assumed to be the 
result of reading aloud from the problem card, unless this complicates the over-all 
interpretation. In cases of doubt, the audio tape was consulted. 

These rules are applied below in mapping the protocol in Figures 3-2 and 3-3 onto the problem space
defined by Figures 3-4 and 3-5. The result is a hypothesis about the subject's solution path that can be
displayed graphically in the form of a so-called Problem Behavior Graph (PBG). 18 The PBG generated
from the protocol in Figures 3-2 and 3-3 is shown in Figure 3-6. The first subsection below describes how
the PBG is generated. The second subsection asks whether the path hypothesis reveals any unusual or
special events, events which are in special need, as it were, of being explained.

Mapping the protocol onto the problem space 

In the beginning the subject is simply reading the problem text, as she has been instructed to do
(F1-F7). Presumably there is some change of goals between F7 and F8, from "read the text" to "solve the
problem", but there is no trace of it in the protocol. She then begins her solution attempt by reading the
first premise (F8). Her next step cannot be interpreted within the problem space: she reflects on the
meaning of the term "between" (F10-F11). This does not produce any new result in terms of the problem
space. (This is the only step outside the problem space.) In F12 she is back in her attempt to solve the
problem. She continues to read the premises in the order in which they are written, i.e., every time she
reads, she reads the next premise (F12-F14). Upon reading the fourth premise, she notices the repeated
occurrence of the yellow block, and begins to make inferences. The content as well as the phrasing of
the fragments F16 and F17 implies knowledge of the internal relations between the yellow, blue, and red
blocks. I interpret F16 as an application of the TRNS operator to premise four and F17 as an application
of INT to premise two. The question of F15 then remains. The tone of voice on the tape does not
support the hypothesis that premise two is being re-read at this point. Working memory considerations
show that premise two should still be available. Therefore, F15 has been interpreted as a rehearsal, i.e., not
as a generation of a new result. The output from the sequence F15-F17 is probably tagged as "unclear"
(F18), since it is followed by a consolidation backup (F19-F20).

17 [...] to compare the interpretative principles used here with the discussion of protocol interpretation in [...]

The process is then repeated. In F21-F23 she reads again the premises in the order in which they 
are written on the problem card. There is one difference: after having translated premise four (F24), she 
re-reads premise two (F25) before it is integrated. No such re-reading was needed in the previous 
episode. However, as described above, in that episode she felt a need to rehearse premise two before 
translating premise four. She probably has some problem with working memory at this point, even though 
the assumption of a short-term capacity of three chunks predicts that premise two should still be 
available. At the end of this passage (F27-F29) she again has the result "yellow blue red". 

She now continues by reading the two premises she has skipped, namely premise 3 (F30) and 
premise 1 (F31), and the premise she has not yet looked at, namely premise 5 (F34), in that order. In 
F32 she is trying to do something with the black block, but it is unclear what. She fails, backs up, and 
re-reads premise 1 instead (F33). 

In F35 she tries to work on the white block, but fails and backs up (F37). She tries again, and 
succeeds, achieving the result "white red black green" (F40). It must have happened through the 
translation of premise five, followed by an integration of premise one. In F42 she discovers a 
contradiction between her partial result and premise 3. This leads to a backup and revision of her mental 
model to "white green black red" instead (F44). She then integrates premise three into this model, 
because in F46 she says that the white block is bottommost, a conclusion which only follows from the fact 
that the green is bottommost but one, combined with the fact that the white is below the green. 

The subject then reminds herself that the blue block is still missing from the model and reads premise
four, which says that the blue block is immediately below the yellow one (F48-F49). There is no evidence
that she does anything with this premise. (Since neither the blue nor the yellow block is as yet placed in
the model, no extension of the model is possible at this point.) Instead, she reads premise two (F50), and
integrates it (F51-F52). After that, the yellow block is part of the model, and premise four can be
integrated (F53-F54). Finally, having placed all the objects in the model she reads the question (F55) and
derives the answer (F56).

[Figure 3-6, Part 2: nodes 19-31 of the Problem Behavior Graph, covering fragments F31-F42. The graph
layout is not recoverable from the source; legible states include (BTM wh bk TOP), (BTM wh rd:bk:gr TOP),
and imp(BTM wh rd:bk:gr TOP).]

Figure 3-6. Problem Behavior Graph for Subject SI6's Solution Path for the Block Problem, Part 2.



[Figure 3-6, Part 3: nodes 32-40 of the Problem Behavior Graph, covering fragments F43-F56. The graph
layout is not recoverable from the source; legible states, in order, include (BTM wh gr:bk:rd TOP),
(BTM:wh:gr:bk:rd TOP), (BTM:wh:gr:bk:rd yw TOP), (BTM:wh:gr:bk:rd:bu:yw TOP), and (ANSWER rd).]

Figure 3-6. Problem Behavior Graph for Subject SI6's Solution Path for the Block Problem, Part 3.
The notation used in this figure is introduced in Figures 3-4 and 3-5.
In order to compress the figure, the following abbreviations are used for the predicate terms:
A = Above, AI = Immediately-Above, AM = Topmost, AT = Topmost-But-One,
U = Under, UI = Immediately-Under, UM = Bottommost, UN = Bottommost-But-One,
I = Adjacent, and W = Between.
The following abbreviations are used for the color terms:
bk = black, bu = blue, gr = green, rd = red, yw = yellow, and wh = white.

The above path hypothesis is summarized graphically in the Problem Behavior Graph (PBG) in Figure
3-6. 19 Since the PBG contains 40 nodes and the solution time was 220 seconds, the average residence time,
i.e., the time the subject spent in each knowledge state before deciding which operator to apply, was 5.5
seconds, a result that is compatible with other analyses of think-aloud protocols (Newell & Simon, 1972).

Special events 

Given the above interpretation of the subject's performance, we might ask if the solution path 
exhibits any remarkable features. Are there any events that are in particular need of explanation, as it 
were? There are five such events, or groups of events. 

First, as the attentive reader has noticed, there is no trace of the partial result "yellow blue red" (which 
is achieved in fragments F16-F17) in the latter half of the protocol. SI6 creates the ordering "white red 
black green" (F40) and then continues to integrate the information about the yellow, blue, and red blocks 
into this ordering, as if she had no previous knowledge of their relative positions. Somewhere in the 
interval F29-F33 she forgot the mental model she was building. The problem is to explain why such a 
memory failure occurred at this point, but nowhere else in her solution attempt. 

Second, the discovery of the contradiction between her mental model and premise 3 in F42 is crucial 
for the subject's solution. How did it come about? Premise three happens to be the only premise in the 
problem which could have shown her that the result achieved in F40 was wrong. What made her re-read 
this premise at such an appropriate time? Was it a chance event, or was she looking for such 
information? If she was looking for it, how did she know she needed it? 

Third, in the beginning of the solution attempt, the subject rehearses premise 2 (F15); in the next pass 
over the premises, she re-reads premise 2 in the corresponding position (F25). In both cases, the 
assumption of a three-chunk working memory predicts that premise 2 should be available in working 
memory at that point. Thus, both the rehearsal and the re-reading are in need of explanation. 

Fourth, in deriving her first partial result, "yellow blue red", the subject worked with the model from the 
top and downwards (F16-F17). But later in the protocol, while constructing the sequence "white green 
black red", she verbalizes her model from the bottom and upwards instead (F40). 

Fifth, there are four backups in the protocol which are not followed by the exploration of new paths in 
the problem space, but by repetitions of previously performed inferences (F19, F28.1, F32, and F37). I 
call them "consolidation backups". There are two questions to be asked about each such event: "Why 
does it occur when it does?" and "What determines its scope?" (i.e., how many previous steps are 
repeated?). 

3.4. Diagnosing the Subject's Strategy 

The solution path (the PBG) is a low-level theory or explanation for the observed performance (the 
protocol). The next step in the diagnosis is to invent a higher-level theory that explains the solution path. 
Such an explanation takes the form of a strategy for solving spatial arrangement problems which will 
generate the hypothesized path when applied to the Blocks Problem. 

It would be desirable to mechanize the process of inferring heuristics that explain a particular solution 
path. Several Artificial Intelligence systems have been proposed that invent a strategy hypothesis, given 
a protocol (Waterman & Newell, 1971), a problem space (Ohlsson & Langley, 1984, in press), or a 
solution path (Langley, Ohlsson, & Sage, 1984). Langley, Wogulis, and Ohlsson (this volume) report 
some recent research with respect to this problem. 20 However, such systems are not yet in practical use, 
so the practitioner of trace analysis has to be prepared to guess the subject's strategy, and then to 
evaluate his guess by applying it to the path (see next section). 

This section hypothesizes a strategy for SI6 and the next section evaluates that hypothesis. I first 
point out some global properties of SI6's style of problem solving, and then describe her problem solving 
heuristics in detail. 

Global comments on SI6's strategy 

There is a strong recency effect in SI6's protocol. The subject's inferences always deal with newly 
created information. Previous results never seem to confuse her, nor does she make use of them. 

There is evidence that the inferential operators TRNS and INT are applied only when certain patterns 
of information are present. For instance, the subject reads four premises in the beginning of her problem 
solving attempt before she applies the TRNS operator, apparently waiting for some particular condition to 
be satisfied before she starts building her mental model. Similarly, the INT operator is not always applied 
as soon as there is a new proposition in working memory, but only under certain circumstances. Detailed 
hypotheses about the patterns she is looking for are stated below. 

FPM(<model>)         Find a proposition related to the current model. FPM searches working memory for a 
                     proposition with one of its arguments already placed in the model. It returns that 
                     proposition, if any, or else it fails. 

FPP(<proposition>)   Find a proposition related to a given proposition. This operator searches working 
                     memory for a proposition related in a particular way to the proposition given as 
                     argument. It returns that proposition, if any, or else fails. FPP is looking for a 
                     chaining pattern, i.e., a pair of binary relations such that the second argument of the 
                     first proposition is the same as the first argument of the second, e.g., (R x y)(P y z). 

GMO(<model>)         Generate missing objects. This operator compares the current model and the text, 
                     and returns a set of objects which are not yet included in the model. If it cannot find 
                     any missing object, it fails. 

SCAN(<probe>)        Scan the text for the element described by the probe. This operator takes a probe as 
                     input, and looks through the text for items that conform to the description in that 
                     probe. The probe can be an object, in which case SCAN finds the first premise that 
                     mentions that object. The probe can also be the constant UNUSED, in which case 
                     SCAN finds the first premise which has not yet participated in any inference. It 
                     returns a description of (the location of) the item it finds. 

BKUP()               Backup. This operator retrieves the knowledge state that was current immediately 
                     before the last TRNS or INT inference, and reinstates it as the current knowledge 
                     state. 

PREB()               Prepare for backup. This operator stores the current knowledge state in long-term 
                     memory. It applies immediately before a TRNS or INT inference. 

Figure 3-7: Auxiliary problem solving operators for subject SI6. 
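
The three search operators defined in Figure 3-7 (FPM, FPP, and GMO) can be illustrated with a small sketch. This is not the author's PSS program; it is a minimal Python rendering under stated assumptions: a proposition is a tuple holding a relation name followed by its arguments, the model is a list of object names, a failing operator returns None, and FPP accepts a chain in either direction. The function and variable names are hypothetical.

```python
from typing import Iterable, List, Optional, Set, Tuple

# A proposition is a relation name followed by its arguments,
# e.g. ("U", "wh", "bk") for "white is under black".
Proposition = Tuple[str, ...]


def fpm(model: List[str], working_memory: List[Proposition]) -> Optional[Proposition]:
    """FPM: find a proposition with at least one argument already in the model."""
    for prop in working_memory:
        if any(arg in model for arg in prop[1:]):
            return prop
    return None  # the operator fails


def fpp(given: Proposition, working_memory: List[Proposition]) -> Optional[Proposition]:
    """FPP: find a binary proposition that chains with `given`, e.g. (R x y)(P y z)."""
    if len(given) != 3:
        return None  # only binary relations can take part in a chain
    for other in working_memory:
        if other == given or len(other) != 3:
            continue
        # Assumption: the chain may run in either direction.
        if given[2] == other[1] or other[2] == given[1]:
            return other
    return None  # the operator fails


def gmo(model: List[str], text_objects: Iterable[str]) -> Optional[Set[str]]:
    """GMO: return the objects mentioned in the text but absent from the model."""
    missing = set(text_objects) - set(model)
    return missing if missing else None


if __name__ == "__main__":
    memory = [("A", "yw", "rd"), ("UI", "bu", "yw")]
    print(fpp(("UI", "bu", "yw"), memory))         # chains on the shared object "yw"
    print(fpm(["wh", "bk"], [("UN", "gr"), ("W", "bk", "rd", "gr")]))
    print(gmo(["wh", "bk"], ["wh", "bk", "rd"]))
```

Returning None for failure keeps the sketch close to the wording of the figure ("or else it fails"); a fuller model would also have to say what the strategy does after a failure.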

The subject accesses the external display according to different heuristics during different phases of 
her problem solving. In the beginning she is reading the premises in the order in which they are written. 
After the first application of the TRNS operator she looks around for information which has not yet been 
used. Finally, at the end, she is searching for information about particular objects. 

The subject waits until the end to read the question. This confirms the data-driven character of her 
processing; a goal-driven system would begin with the question. 

Formal description of SI6's strategy 

In order to describe SI6's strategy as an information processing system, four new operators, two 
attentional (FPP and FPM) and two perceptual (GMO and SCAN), are needed. They do not change the 
knowledge state as defined in Figures 3-4 and 3-5, but they control attention, find arguments for the other 
operators, and access the external display. They are defined in Figure 3-7, which also defines the two 
backup operators (BKUP and PREB). 

The subject's strategy is here represented as a collection of heuristic rules. The rules are stated in a 
particular format known as a production system. In this format each rule has a condition, a conjunction of 
descriptive clauses, and an action, a list of problem solving operators. The interpretation of the rule is 
that if a knowledge state satisfies the condition, then the action should be 
carried out in that state. Production system models are common in the study of human cognition. The 
reader is referred to Davis and King (1976), Hunt and Poltrock (1974), Klahr, Langley, and Neches 
(1987b), and Waterman and Hayes-Roth (1978) for general overviews and discussions of production 
system languages. Although the production system formalism was introduced into psychology in 
connection with trace analysis (Newell, 1966; Newell & Simon, 1972), there is no inherent conceptual 
connection between trace analysis and production systems. Other formalisms for the representation of 
problem solving strategies could be used to express the result of trace analysis. 

The production system model of SI6 on the Block Problem is shown in Figure 3-8. 

A     <question> <model>                               =>  ANSW(<question>) 

Q1    new(FAIL GMO)                                    =>  READ(QUESTION) 

I1a   <model> new<proposition>                         =>  FPM(<model>) (=> proposition); 
                                                           INT(proposition) 

I2    new<model> <proposition>                         =>  INT(<proposition>) 

T1a   abs<model> new<proposition>.1 <proposition>.2    =>  FPP(<proposition>.1) (=> proposition); 
                                                           TRNS(<proposition>.1) 

B     imp<model>                                       =>  BKUP() 

R3a   new<expression> (REMAINS = NONE)                 =>  GMO(<model>) (=> object); 
                                                           SCAN(object) (=> premise); 
                                                           READ(premise) 

R2a   new<expression> (HASMODEL = YES)                 =>  SCAN(UNUSED) (=> premise); 
                                                           READ(premise) 

R1    new<expression>                                  =>  READ(NEXTPREM) 

S1    BEGIN                                            =>  READ(FIRSTPREM) 

Figure 3-8: Production system model of subject SI6. 

The notation used is a variant of the standard BNF notation. 21 This notation is useful for discussing 
production systems, because it imposes some discipline on the statement of the production rules while at 
the same time allowing us to abstract from many of the technical details needed to make a running 
program. Below I give a natural language paraphrase of, and sometimes a comment to, each production rule. 22 

A     When the question has just been read, and a model is available, try to infer the 
      answer. The condition on this rule is very general, but SI6 does not read or attend to 
      the question until she is already convinced that the model is complete. Hence, the 
      fact that the question has been attended to is itself an indication that the model is 
      completed, and that the ANSW operator should be applied. 

Q1    When there are no more missing objects, read the question. The fact that there are 
      no more missing objects is a sign that the model is complete and that the problem 
      solving process can move into the question-answering stage. 

I1a   When both a new proposition and a model are available, check if the proposition has 
      the right relation to the model, and, if so, try to integrate it. The "right relation" is 
      defined by the FPM operator: It returns a proposition that has at least one of its 
      arguments placed in the model. 

I2    When a new model has been derived and there is at least one unused proposition in 
      working memory, then try to integrate that proposition. 

T1a   When there is no model in working memory, but at least one new and one old 
      proposition, then check whether they have the right relation to each other, and, if so, 
      try to translate the most recent of them into a model. The "right relation" is in this 
      case defined by the FPP operator: It is a chaining pattern like "(x R y)(y Q z)". The 
      three rules I1a, I2, and T1a regulate the effort to draw new inferences from newly 
      created information. 

B     When the model contradicts the given information, then back up. 

R3a   When premises have been used at least once and there is nothing else to do, then 
      find one or more missing objects, locate the premises which deal with those objects 
      and read those premises (regardless of whether they have been read before or not). 

R2a   When a new model has just been achieved and there are still unused premises, then 
      read those premises. 

R1    When a new model has just been achieved, read the next premise. The productions 
      R3a, R2a, and R1 represent three different heuristics for how to access the external 
      display. 

S1    Start the problem solving process by reading the first premise. 

21 The rules for this notation can be found in Newell & Simon (1972, pp. 44-46). 

22 Regarding the elaborate labeling of the production rules: the labels are intended to facilitate 
comparison between this production system and other production systems for the same domain in other publications. 

In summary, the subject begins by reading the premises in the order in which they are stated in the 
problem text. When a chaining pattern appears, she starts building a mental model. Having begun 
building a mental model, she scans the problem text for unused information. Whenever she extends her 
mental model, she tries to integrate any unused propositional information which is available in working 
memory. Having considered all premises without completing her model, she identifies specific objects 
which are missing from the model, and reads any information, old or new, that is available about them. 
When the model is complete, she reads the question, and answers it by reading off the answer from the 
model. 
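
The way a list of condition-action rules produces behavior can be shown with a minimal recognize-act loop. The sketch below is an illustration only and does not reproduce the PSS implementation: it assumes that the first matching rule in the list fires, and it uses toy stand-ins for rules S1 and R1 whose state keys (`begin`, `read`) are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

State = Dict[str, Any]


@dataclass(frozen=True)
class Production:
    name: str
    condition: Callable[[State], bool]  # does the rule match this knowledge state?
    action: Callable[[State], State]    # returns the successor knowledge state


def run(rules: List[Production], state: State, max_cycles: int = 50) -> Tuple[List[str], State]:
    """Recognize-act cycle: fire the first matching rule until no rule matches."""
    trace: List[str] = []
    for _ in range(max_cycles):
        matching = [rule for rule in rules if rule.condition(state)]
        if not matching:
            break
        rule = matching[0]  # assumption: list order resolves conflicts
        state = rule.action(state)
        trace.append(rule.name)
    return trace, state


# Toy stand-ins for rules S1 and R1; the state keys are hypothetical.
PREMISES = ["FIRSTPREM", "SECPREM", "THIRDPREM"]


def read_first(state: State) -> State:
    return {**state, "begin": False, "read": [PREMISES[0]]}


def read_next(state: State) -> State:
    return {**state, "read": state["read"] + [PREMISES[len(state["read"])]]}


RULES = [
    Production("R1", lambda s: not s["begin"] and len(s["read"]) < len(PREMISES), read_next),
    Production("S1", lambda s: s["begin"], read_first),
]

if __name__ == "__main__":
    trace, final = run(RULES, {"begin": True, "read": []})
    print(trace)          # ['S1', 'R1', 'R1']
    print(final["read"])  # ['FIRSTPREM', 'SECPREM', 'THIRDPREM']
```

The loop stops when no condition is satisfied, which here happens once every premise has been read; the full model would continue with the scanning, inference, and question-answering rules.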
3.5. Evaluating the Strategy Hypothesis 

The solution path in Figure 3-6 is an hypothesis about the sequence of thoughts the subject had as 
she solved the Block Problem. The list of production rules in Figure 3-8 is an hypothesis about her 
problem solving strategy. We do not yet know whether the strategy explains the path or not. A strategy 
hypothesis must be evaluated by applying it to the relevant path. Its justification lies in its ability to 
generate or reproduce the solution path. 

The basic method of applying a production system to a solution path is to ask for each state-operator 
pair along the path whether there is some production rule which has its condition satisfied in that state 
and which has that operator as its action. If there is such a rule, that step is covered by the production 
system. If not, the system has made what is known as an error of omission. The method is complicated 
by the fact that several different rules might have their conditions satisfied in one and the same state, and 
by the fact that the path hypothesis is necessarily incomplete, i.e., it cannot contain all the mental steps 
the subject actually went through. An explanatory procedure which takes these aspects into account is 
needed. 

In the present analysis, the following procedure was used while applying the production rules in 
Figure 3-8 to the solution path shown in Figure 3-6. The reader might want to compare this procedure 
with the discussion in Newell and Simon (1972, pp. 197-199). 

1. Suppose that the analysis has proceeded to the nth node in the PBG. The step to be 
explained next is the occurrence of the operator Q leading out from that node. A list is 
made of all the productions which have such conditions that they could be evoked at that 
node. The production at the top of the list is assumed to have been evoked. Its action part 
is compared to the link in the PBG; if it can generate the operator Q, the step leading out 
from the nth node has been explained. The resulting change in the knowledge-state is 
computed, and the analysis proceeds from the next node. 

2. If the action-part of the topmost production cannot generate the operator Q, the protocol is 
scanned for evidence which contradicts the assumption that the production was fired. If 
there is no such evidence, the production is assumed to have fired. A node is then 
interpolated between node n and node (n+1). 

3. The process now continues, until either of the following two events occurs: 

• The production system finally generates an occurrence of the operator Q, without 
having contradicted any evidence in the protocol. If this happens, the whole 
sequence of production occurrences and the corresponding nodes are accepted as 
part of the solution path. The node which in the PBG appears as the nth node will be 
replaced by a sequence of nodes. The first node in the sequence will be identical to 
the nth node, and the last link in the sequence will be the occurrence of the operator 
Q. The occurrence of the operator has then been explained, and the next node is 
computed, and the analysis proceeds from it. 

• The production system may finally generate some production occurrence which 
cannot be reconciled with the protocol. Then the entire sequence of production 
occurrences interpolated after the nth node is discarded. The analysis is then 
resumed at the nth node. The topmost production is erased from the list of 
productions which could have fired at that node. The topmost among those 
remaining is then assumed to have fired, and the entire process is repeated. 

4. If it happens that none of the productions which could have fired at the nth node is capable 
of giving rise to an explanation of the occurrence of the operator Q, the conclusion is that 
the production system cannot explain what happened at that node. A question mark is 
entered, the change caused by the operation Q is computed, and the analysis resumes 
from the (n+1)th node. 
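
The four steps above can be sketched as a small search routine for a single node. This is a simplified reading of the procedure, not the analysis actually carried out by hand: knowledge states are reduced to string labels, the check against the protocol is an analyst-supplied function (`contradicted`, hypothetical), and a bound on the number of interpolated nodes is added as an assumption so that the loop always ends. Returning None corresponds to entering a question mark in step 4.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Rule:
    name: str
    condition: Callable[[str], bool]  # can the rule be evoked in this state?
    operator: str                     # the operator its action generates
    apply: Callable[[str], str]       # the resulting knowledge state


def explain_step(
    state: str,
    observed: str,
    rules: List[Rule],
    contradicted: Callable[[Rule, str], bool],
    max_interpolated: int = 5,
) -> Optional[List[str]]:
    """Return the rule firings that lead from `state` to operator `observed`, or None."""
    candidates = [rule for rule in rules if rule.condition(state)]  # step 1
    for first in candidates:  # step 3, second case: erase the top rule and retry
        chain: List[str] = []
        current, rule = state, first
        for _ in range(max_interpolated + 1):
            if rule.operator == observed:  # steps 1 and 3: operator Q is generated
                return chain + [rule.name]
            if contradicted(rule, current):  # discard the interpolated sequence
                break
            chain.append(rule.name)  # step 2: interpolate a node
            current = rule.apply(current)
            applicable = [r for r in rules if r.condition(current)]
            if not applicable:
                break
            rule = applicable[0]
    return None  # step 4: the system cannot explain this step


# Toy rules and protocol evidence, purely illustrative.
RULES = [
    Rule("r1", lambda s: s == "s0", "READ", lambda s: "s1"),
    Rule("r2", lambda s: s == "s0", "SCAN", lambda s: "s2"),
    Rule("r3", lambda s: s == "s2", "INT", lambda s: "s3"),
]


def no_read_at_start(rule: Rule, state: str) -> bool:
    return rule.name == "r1"  # the protocol rules out a READ here


if __name__ == "__main__":
    print(explain_step("s0", "INT", RULES, no_read_at_start))   # ['r2', 'r3']
    print(explain_step("s0", "ANSW", RULES, no_read_at_start))  # None
```

In the first call the top candidate is rejected by the protocol evidence, the second candidate is interpolated, and the observed operator is then generated; in the second call no sequence of firings produces the observed operator.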

In order to evaluate how well the production system explains the solution path we have to consider a 
number of different dimensions, the most important of which are coverage, simplicity, and realism. 

Coverage. How many of the knowledge states in the complete solution path are covered by the 
production rules? There are 48 states, three of which lie outside the problem space. Of the remaining 45 
nodes, 42 (93%) are covered. The corresponding figures for the Problem Behavior Graph are 37 and 31 
(84%). (The figures differ because the procedure for applying the production system allows the 
interpolation of states between the nodes in the PBG.) 
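
The two coverage percentages follow directly from the counts given in the paragraph above. The short check below uses only those counts; the function name and its parameters are illustrative.

```python
def coverage(covered: int, total: int, outside: int = 0) -> float:
    """Share of nodes inside the problem space that some production rule accounts for."""
    return covered / (total - outside)


# Complete solution path: 48 states, 3 outside the problem space, 42 covered.
complete_path = coverage(covered=42, total=48, outside=3)

# Problem Behavior Graph: 37 nodes inside the problem space, 31 covered.
pbg_only = coverage(covered=31, total=37)

if __name__ == "__main__":
    print(f"{complete_path:.0%}, {pbg_only:.0%}")  # 93%, 84%
```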

Another aspect of coverage is the number of special events in the solution path which the account 
explains. The production system explains the working memory failure in fragment F29-F33. It also 
explains the discovery of the contradiction in F42. However, the production system does not explain the 
rehearsal of premise 2 in F15, the re-reading of premise 2 in F25, the change in the order in which the 
mental model is verbalized, or the occurrence of the two consolidation backups in F28.1 and F32, nor 
does it explain the scope of any of the consolidation backups. 

Simplicity. Taken by itself, an analysis of coverage is not decisive. The problem of coverage can 
always be solved trivially by adding production rules until every step along the solution path is covered by 
some rule. In the limit, one could add a separate production rule for each step. Therefore, the drive 
towards completeness must be balanced by a concern for simplicity. 
The number of different productions in Figure 3-8 is 10. The average number of occurrences per 
production in the complete solution path is 4.8. There are three productions which are used only once: 
S1, A, and B. S1 and A begin and end a solution process; they fire of necessity only once each. B is the 
production which causes a backup upon the discovery of a contradiction; it fires only once because the 
subject discovered a contradiction only once. In short, each production rule adds general explanatory 
power to the strategy hypothesis, rather than just ad hoc coverage of some particular step. 

Realism. The production system formalism is a general format for the representation of procedures, 
but all production rules are not equal, psychologically speaking. In order to be psychologically plausible, 
rules must correspond to pieces of knowledge. The strength of a trace analysis depends on the extent to 
which it generates rules which correspond to useful pieces of heuristic knowledge, rather than weird, 
complicated, or incomprehensible rules which have no other function than to reproduce the particular 
observed behavior. 

The subjective way of deciding this is to inspect the production system and reflect on each rule, 
intuiting whether the rule makes sense and whether it is arbitrary. A more intersubjectively valid method 
is to translate the set of production rules into a running computer program, and then run the program on 
other tasks than the one the subject solved. If the program can solve other tasks, then the production 
rules are not arbitrary constructions specific to the observed path, but constitute a problem solving 
strategy of some generality. 

The production system in Figure 3-8 was translated into a computer program. The language used 
was PSS, a production system language designed by the author (Ohlsson, 1979). It bears a family 
resemblance to such languages as PSG (Newell, 1973), OPS5 (Forgy, 1981), PRISM (Langley, 1983), 
and ACT (Anderson, 1983). The entire program is reproduced in Appendix A. The program solved the 
Blocks Problem correctly, generating a solution path which corresponds closely to the solution path by 
SI6, except for the lack of consolidation backups. In particular, the forgetting of the partial result "yellow 
blue red" is reproduced by the program, as well as the discovery of the contradiction with the given 
information in F42. The program was also run on fourteen other spatial arrangement problems of varying 
difficulty (Ohlsson, 1980a). It solved seven of them correctly. The computer runs showed that the 
program succeeds on some spatial arrangement problems of equal complexity as the Block Problem, but 
fails on others. The program also solved 5 out of 6 spatial arrangement problems of lesser complexity, 
but failed to solve any problems of higher complexity. The main weakness of the program is that it lacks 
heuristics for how to proceed when either the FPP or the FPM operator fails. This accounts for the failure 
on the simpler problems. For the more complex problems, the main source of failure was insufficient 
working memory capacity. The pattern of results is similar to what one would expect from a human 
subject. 

In summary, the strategy hypothesis does rather well on each of the three basic evaluation 
dimensions. With respect to coverage, it handles almost all events in the think-aloud protocol. The 
events which are not explained (the rehearsal of premise 2 in F15, the re-reading of premise 2 in F25, 
the change in how the model is read out, and the occurrence and scope of consolidation backups) are all 
related to working memory capacity. The first-approximation theory of working memory used in this 
analysis, a box with space for three chunks of information, is, not surprisingly, too coarse to capture the 
details of how working memory influenced the problem solving effort. With respect to simplicity, the 
strategy hypothesis contains no more than ten rules, each of which covers, on the average, five nodes in 
the path. With respect to realism, computer implementation proved that the strategy can solve other 
spatial arrangement problems than the one it was designed to solve. 
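
The "box with space for three chunks" can be pictured as a bounded buffer. The sketch below is only an illustration of that first-approximation idea: the source specifies the capacity, while the displacement policy (the oldest chunk is lost first) and the chunk labels are assumptions made here.

```python
from collections import deque


class WorkingMemory:
    """A box with room for a fixed number of chunks.

    Assumption: when the box is full, the oldest chunk is displaced first.
    """

    def __init__(self, capacity: int = 3) -> None:
        self._chunks = deque(maxlen=capacity)

    def store(self, chunk: str) -> None:
        self._chunks.append(chunk)  # a full deque silently drops its oldest item

    def holds(self, chunk: str) -> bool:
        return chunk in self._chunks


if __name__ == "__main__":
    wm = WorkingMemory()
    wm.store("partial model")
    wm.store("premise a")
    wm.store("premise b")
    print(wm.holds("partial model"))  # True: three chunks fit
    wm.store("premise c")
    print(wm.holds("partial model"))  # False: displaced by the fourth chunk
```

Under this policy a partial model is lost as soon as three newer chunks have been read in, which is the kind of memory failure the analysis attributes to SI6; rehearsal and re-reading would need a richer model.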

3.6. A Do-It-Yourself Summary 

The result of the trace analysis is a description of subject SI6 in terms of her problem space and her 
problem solving strategy, and a description of her performance in terms of a solution path. The 
description claims that she successively integrates the propositional information given in the problem text 
into a mental model of the linear ordering, until the positions of all objects have been determined. Her 
main difficulty in dealing with the task is that at each point of the process she has to search the problem 
text for some premise which will enable her to infer the next extension of her model. While carrying out 
the search through the problem text, the mental model she has achieved up to that point is subject to 
working memory decay. The major determinant of the shape of her solution effort is not her spatial 
knowledge, but her strategy for attention allocation. 

In order to attempt this kind of cognitive diagnosis the reader should collect a think-aloud protocol 
from a task he is interested in, and then apply the following explanatory procedures: 

1. Identify the subject's problem space: 

a. Construct a representational language for the task by noticing the concepts and 
representational formats the subject is using in talking about the task. 

b. Define a set of operators based on passages in the protocol which lead to new 
knowledge. 

c. Hypothesize the goal of the subject. 

d. Hypothesize a limit on the subject's working memory capacity. 

2. Generate a solution path by mapping each fragment in the protocol onto some expression 
in the representational language. If the expression represents new knowledge about the 
task, then infer the application of an operator. The solution path is a description of the 
observed performance in terms of the problem space. 

3. Invent problem solving heuristics which capture the regularities in the solution path. 

4. Evaluate the strategy hypothesis by investigating its coverage, simplicity, and realism. 

5. Implement the strategy as a computer program and observe its performance on the 
experimental task, and on other tasks as well. 

4. Implications for Standardized Testing 

The process of generating an information processing model with the help of trace analysis is a 
protracted process involving many decisions and much trial and error on the part of the analyst. 23 
Standardized testing, on the other hand, requires that a description of cognitive functioning can be 
achieved with little enough effort and in short enough a time to be useful in practical contexts. The 
purpose of this section is to discuss the nature of diagnostic tests that build on information processing 
concepts, and the role of trace analysis in the construction of such tests. 

The psychometric approach to standardized testing is based on the two ideas of measurement and 
standardization. I analyze these cornerstones of the testing movement in the first two subsections below. 
The results of trace analysis are, I believe, incompatible with the idea of measurement, but quite 
compatible with the idea of standardization. I then propose a methodology for the construction of 
standardized tests based on information processing concepts. This admittedly speculative proposal is 
called theory referenced test construction. 

There are, of course, many different bridges to build between the psychometric and the information 
processing traditions. The reader might want to compare the bridge built here with those constructed by, 
for example, Carroll (1976), Cooper (1982), Glaser (1986), Hunt (1986), Just and Carpenter (1985), and 
Snow (1960). A comparative analysis of different conceptualizations of the relation between 
psychometric and information processing methods would be interesting, but falls outside the scope of the 
present chapter. 

23 The analysis presented in this chapter took approximately six weeks to carry out. The protocol was 
selected from a corpus of fifty protocols. The analysis of the entire corpus took more than two years. 

4.1. Trace Analysis and Measurement 

The psychometric tradition attempts to describe cognitive functioning with a measure, or, more 
accurately, a set of measures, defining a point in a multidimensional space (Nunnally, 1967; Sternberg, 
1985). But analyses such as the one presented above invalidate this type of description. A set of 
measures cannot accurately represent the nature of SI6's cognitive processes, for two reasons. 

First, the operation of a cognitive mechanism depends essentially on its structure. By "structure" I 
mean the breakdown of the mechanism into parts, and the interactions between those parts. For 
instance, the spatial reasoning of SI6 depends critically on the interaction between her attention allocation 
and her spatial inferences, as well as on the interaction between her problem solving strategy and her 
short-term memory capacity. The abstraction involved in expressing her spatial reasoning ability as a 
measure would inevitably hide those interactions. 

Second, the operation of a cognitive mechanism depends essentially on the content of its knowledge. 
The crucial feature of spatial reasoning is not how many inference rules a person knows, but exactly 
which rules he knows. The runs with the computer model of SI6 proved that a rule that is necessary for 
the solution of one problem can be irrelevant to the solution of another problem of 
the same level of difficulty (as measured, say, by the number of inferences required to reach the solution). 
Measures of spatial reasoning ability inevitably abstract from the content of spatial knowledge. 

In summary, cognitive mechanisms are not well described by measures. The major implication of 
information processing concepts with respect to testing is that tests should produce diagnostic 
descriptions that capture the structure and content of cognitive mechanisms. The complexity of the 
analysis of subject SI6 raises the question whether this implication is consistent with the notion of 
standardization. 
4.2. Trace Analysis and Standardization 

The term "standardized" can be applied either to the behavioral record, to the output description, or to 
the explanatory procedures of a diagnostic method. It has a different meaning in each case. 

The first meaning of standardization is that a test is a fixed set of problems. A test consists of 
problems with known properties that are used over and over again. The practitioner does not need to 
invent diagnostic problems; he can use existing ones. This is one way in which standardization 
contributes to practical usefulness. From the point of view of Enaction Theory, generating behavioral 
records with the help of a fixed set of problems is a great advantage, because the work of constructing a 
psychologically plausible problem space does not have to be done all over again for each new diagnosis. 

The second meaning of standardization is that the purpose of diagnostic inquiry is to select among 
pre-defined explanatory accounts. More accurately, particular diagnoses are instances of well-known 
explanation patterns. For instance, the names of diseases refer to previously specified physiological 
states. A doctor who decides that a patient has, say, pneumonia is not discovering a new disease, or 
inventing a new theory of human physiology, or even constructing a novel account of a patient. He is 
deciding that his current case is an instance of a known explanation schema. Similarly, a car mechanic 
who concludes that a car fails to start because of a broken wire is not constructing a theory, but applying 
a standard explanation type. 24 

Research, by contrast, involves an element of 
discovery and creative thought precisely because the type of explanation that can account for the 
phenomenon is not known beforehand, but has to be invented as the explanatory effort proceeds. In a 
well-understood field of inquiry, on the other hand, we already know which types of explanation will suffice 
to account for particular types of phenomena. Faced with an instance of a well-understood phenomenon, 
the task of the practitioner is to select which variant of the relevant explanation type w apply. This is, of 
course, a much simpler problem than inventing a new explanation type. For example, a medical doctor 
can diagnose many an infectious disease in a matter of minutes or at most hours, although the research 
that revealed the physiological mechanism of the disease might have taken many years. In short, the 
second meaning of "standardized" is that diagnosis does not aim to invent a new explanation, but to 
select among already known explanations. Diagnostic methods are, by definition, closed methods. 



24 Clancey (1985) has developed the difference between solution construction and solution selection in an
A. I. context.

The implication of the above argument is that standardized testing is only possible in a well-understood
domain. We cannot construct a standardized test for a psychological domain unless we have a theory of
human performance in that domain, because the task of a diagnostic procedure is to select among the
explanations provided by such a theory. Theory construction must precede test construction, a conclusion
already reached by Frederiksen (1986) on the basis of other considerations. 25 This conclusion specifies
the role of open methods like trace analysis in test construction: open methods are needed for the
construction of the relevant theory.

The third meaning of standardization is that there exists a well-specified procedure for mapping the
set of test responses onto a diagnostic description. One of the great strengths of the psychometric
approach is its repertory of well-specified procedures. Statistical theory provides the psychometrician
with well-motivated, intersubjectively valid algorithms. But the explanatory procedures used in the
psychometric approach are based on the idea of measurement, and so cannot be carried over into
non-quantitative testing.

In the non-quantitative case diagnosis is a kind of classification (Clancey, 1985). The explanatory
procedure classifies the pattern of observed responses as belonging to a particular explanation, or,
equivalently, it discriminates between alternative explanations on the basis of the pattern of responses.
Recent research in expert systems has shown that complex diagnostic procedures in a variety of
domains, including medicine and electronic trouble shooting, can be specified with enough precision to be
implemented on a computer (Clancey, 1985; Hayes-Roth, Waterman, & Lenat, 1983). There is, then,
reason to believe that procedures for cognitive diagnosis based on information processing concepts can
be standardized in the form of computer programs, although there exists to date only a handful of
examples (Burton, 1982; Lewis, 1986; Ohlsson & Langley, in press; Sleeman, 1984; Waterman & Newell,
1971).

In summary, the concept of standardization implies (a) that cognitive diagnosis is based on a fixed set
of problems, (b) that the purpose of cognitive diagnosis is to select an explanation from a pre-defined set,
and (c) that the selection of the explanation is based on a well-specified algorithm. The theories and
methods of information processing psychology are quite compatible with these requirements. It should
therefore be possible to design a methodology for the construction of standardized psychological tests
that build on information processing, rather than psychometric, descriptions of mental states.

25 [...] the idea of using tests as research instruments, i. e., as instruments for data collection (rather than
[...]) [...] prerequisite for test construction, the data required to build that theory must have been collected [...]

4.3. Towards Theory Referenced Test Construction 

The purpose of this subsection is to outline an admittedly speculative proposal for a methodology that 
I call theory referenced test construction. According to this methodology the construction of a 
standardized psychological test proceeds through three phases: theory construction, item production, and 
algorithm design. Each phase will be described in turn. 

Theory construction. The construction of a standardized test for diagnosing, say, spatial reasoning,
should begin, I propose, with a descriptive investigation of spatial reasoning, using trace analysis and
other open and intensive methods that aim for singleton descriptions. The question to be answered by
the investigation is "What information processing components (representations, operators, heuristics,
goals, inference rules, etc.) have to be postulated to explain a wide variety of human behavior in the
relevant task domain?" The results of the investigation are summarized in an information processing
theory of human performance in that task domain. The function of that theory is to provide explanations
of particular performances. Diagnosis is the process of mapping a particular performance onto the
best-fitting explanation.

We can think of a theory of human performance as a space of information processing models. Each
model is a specification of an information processing system that can generate (not necessarily correct or
efficient) behavior with respect to the relevant task. Each model, i. e., each point in the space, represents
a standard (type of) explanation for behavior in the relevant task domain. To explain a particular problem
solving performance is to select that model in the space which most closely simulates that performance.
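
The selection step can be pictured with a small Python sketch. It assumes that each model's simulated behavior and the subject's observed behavior are available as sequences of step labels, and it scores fit by crude position-by-position agreement. The step labels, the `agreement` measure, and the function names are all assumptions made for illustration; the text does not commit to any particular measure of closeness.

```python
def agreement(observed, simulated):
    """Fraction of positions at which two step sequences agree."""
    longest = max(len(observed), len(simulated))
    if longest == 0:
        return 1.0
    matches = sum(o == s for o, s in zip(observed, simulated))
    return matches / longest


def best_fitting(observed, traces):
    """Name of the model whose simulated trace agrees most with the observed one."""
    return max(traces, key=lambda name: agreement(observed, traces[name]))


# Hypothetical simulated traces for two points in a model space.
TRACES = {
    "series_formation": ["read", "integrate", "read", "integrate", "answer"],
    "elimination": ["read", "reject", "read", "reject", "answer"],
}

# Hypothetical observed trace of one subject.
OBSERVED = ["read", "integrate", "read", "integrate", "read", "answer"]

print(best_fitting(OBSERVED, TRACES))
```

Positional agreement is only a stand-in here; any measure that orders the models by how closely they simulate the performance could take its place.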

A model space for spatial arrangement problems has been constructed by Ohlsson (1980b, 1982),
using trace analysis. A part of this space has been encoded in a strategy grammar, a formal device
resembling a generative grammar (Ohlsson, 1980a). The model space is defined by a list of information
processing components and the rules for how to combine them into particular models. At the most global
level of analysis there are several basic approaches to spatial arrangement problems. The two most
important approaches are the method of series formation, which consists in constructing a complete
mental model of the linear ordering, and the method of elimination, which consists in eliminating all
possible answers but one. At the next level of analysis each approach is implemented in several different
problem spaces. For instance, problem spaces for the method of series formation differ with respect to
whether the mental model discriminates between adjacent and non-adjacent relations or not, with respect
to whether there is an operator for posing hypotheses or not, and so on. (Subject S16 uses the series
formation method, and her problem space, defined in Figures 3-4 and 3-5, contains a symbolic device for
discriminating between adjacent and non-adjacent relations, but it does not contain an operator for posing
hypotheses.) Each problem space, in turn, can be searched with the help of different strategies, each
strategy being represented by a set of heuristics. For instance, a strategy may or may not include the
chaining heuristic (see rule T1a in Figure 3-8). The approaches, problem spaces, and heuristics make up
a modeling kit, as it were, out of which particular information processing models can be assembled. To
assemble a particular model, one selects an approach, then a problem space which implements that
approach, and then a set of heuristics for searching that space. Ohlsson (1982) showed how different
subjects can be modeled by different combinations of parts from this space.
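
As a rough illustration of the modeling-kit idea, the following Python sketch enumerates a toy model space by choosing an approach, a problem space that implements it, and a set of heuristics. The component names only paraphrase the description above (the `premise_order` heuristic is invented for the example), and the data layout and the function `assemble_models` are assumptions; this is not the strategy grammar of Ohlsson (1980a).

```python
from itertools import chain, combinations

# Hypothetical kit: approach -> problem spaces that implement it.
KIT = {
    "series_formation": ["adjacency_marked", "adjacency_unmarked"],
    "elimination": ["answer_list"],
}

# Hypothetical optional heuristics; a strategy is any subset of them.
HEURISTICS = ["chaining", "premise_order"]


def subsets(items):
    """All subsets of items, from the empty set to the full set."""
    return chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1)
    )


def assemble_models(kit, heuristics):
    """Enumerate (approach, problem space, heuristic set) triples."""
    models = []
    for approach, spaces in kit.items():
        for space in spaces:
            for chosen in subsets(heuristics):
                models.append((approach, space, frozenset(chosen)))
    return models


MODELS = assemble_models(KIT, HEURISTICS)
print(len(MODELS))  # 3 problem spaces times 4 heuristic subsets
```

Each triple is only a label for a model. In a real model space each triple would be expanded into a runnable simulation program.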

The technique of representing a space of information processing models by a modeling kit was first
used by Young (1976, 1978) in a study of length seriation in children. He presented a kit of production
rules for seriation in which individuals at different levels of development are modeled by a different
selection of rules. The same format was used by Young and O'Shea (1981) to describe a model space
for multi-column subtraction. Brown and Burton (1978) used a different but related approach to defining a
space of models for subtraction. They encoded their space of subtraction models in a structure called a
procedure net, a network of procedures with calling relations between them. A number of alternative
versions of the correct procedure are stored at each node in the procedure net. For instance, there might
be several incorrect versions of the borrowing procedure. By making a particular selection among the
versions stored at each node in the network, a particular information processing model is assembled,
representing a standard explanation for incorrect subtraction answers (a so-called bug). Sleeman (1984)
has produced a procedure space for algebra, based on the notion of selecting a set of rules, possibly
including some incorrect rules, from a larger set.
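
The idea of assembling a model by selecting one version at each node can be sketched in Python as follows. The sketch has a single node (the column procedure) with two versions: the correct borrowing procedure and a version that always subtracts the smaller digit from the larger one, an error often described in the subtraction literature. The dictionary layout and all function names are assumptions made for the example; this is not the procedure net of Brown and Burton (1978).

```python
def digits(n, width):
    """Digits of n, least significant first, padded with zeros to width."""
    return [int(c) for c in str(n).zfill(width)][::-1]


def column_correct(top, bottom, borrow):
    """Correct column procedure: returns (result digit, borrow for next column)."""
    top -= borrow
    if top < bottom:
        return top + 10 - bottom, 1
    return top - bottom, 0


def column_smaller_from_larger(top, bottom, borrow):
    """Incorrect version: subtract the smaller digit from the larger, never borrow."""
    return abs(top - bottom), 0


# Hypothetical one-node net: node name -> named versions of the procedure.
NET = {
    "column": {
        "correct": column_correct,
        "smaller_from_larger": column_smaller_from_larger,
    },
}


def make_model(selection):
    """Assemble a subtraction model from one chosen version per node."""
    column = NET["column"][selection["column"]]

    def model(top, bottom):
        width = len(str(top))
        borrow, result = 0, []
        for t, b in zip(digits(top, width), digits(bottom, width)):
            digit, borrow = column(t, b, borrow)
            result.append(digit)
        return int("".join(str(d) for d in reversed(result)))

    return model


correct = make_model({"column": "correct"})
buggy = make_model({"column": "smaller_from_larger"})
print(correct(52, 17), buggy(52, 17))
```

The two assembled models agree on problems that need no borrowing, such as 58 - 13, and disagree on problems that do, such as 52 - 17.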

Although examples of procedure spaces exist in the literature, they have not yet become common.
The proposal made here is that a procedure space should become a standard way of reporting the results
of descriptive studies of human performance. In particular, I am proposing that a procedure space is the
first step in constructing a standardized psychological test. The individual procedures in the space
correspond to particular, pre-defined explanations; the task of a diagnostic procedure is to map an
individual onto one of those explanations on the basis of his performance on the test items.

Item production. Given a space of information processing models, the next task of test construction is
to produce test items, i. e., problems, that will discriminate between those models in the desired way. A
problem discriminates between two information processing models A and B if the performance on that
problem predicted by model A differs in some observable way from the performance on that problem
predicted by model B. The goal of the item production phase is to find a set of problems that discriminates
between all members of some given space of models, or that divides the space into equivalence classes.
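
The two notions defined above, discrimination between a pair of models and division of a model space into equivalence classes, can be sketched in Python. The three two-digit subtraction models are toy stand-ins invented for the example (`zero_when_stuck` in particular is hypothetical), and only the final answer is treated as the observable performance.

```python
from collections import defaultdict


def tens_units(n):
    """Split a two-digit number into (tens, units)."""
    return divmod(n, 10)


def correct(problem):
    top, bottom = problem
    return top - bottom


def smaller_from_larger(problem):
    (tt, tu), (bt, bu) = tens_units(problem[0]), tens_units(problem[1])
    return 10 * abs(tt - bt) + abs(tu - bu)


def zero_when_stuck(problem):
    # Hypothetical error: write 0 whenever the top digit is too small.
    (tt, tu), (bt, bu) = tens_units(problem[0]), tens_units(problem[1])
    return 10 * max(tt - bt, 0) + max(tu - bu, 0)


MODELS = {
    "correct": correct,
    "smaller_from_larger": smaller_from_larger,
    "zero_when_stuck": zero_when_stuck,
}


def discriminates(problem, model_a, model_b):
    """True if the two models predict different answers to the problem."""
    return model_a(problem) != model_b(problem)


def equivalence_classes(problems, models):
    """Group model names by their predicted answer pattern over the problems."""
    classes = defaultdict(list)
    for name, model in models.items():
        pattern = tuple(model(p) for p in problems)
        classes[pattern].append(name)
    return sorted(classes.values())


print(equivalence_classes([(58, 13)], MODELS))
print(equivalence_classes([(52, 17)], MODELS))
```

A problem that needs no borrowing leaves all three toy models in one class, while a single borrowing problem separates them completely.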

Item production can be broken down into two processes, item generation and item selection. Both of
these processes can be automated. A problem generator is a computer program that can generate
possible test items. The art of programming problem generators is currently being explored in research on
intelligent tutoring systems (Sleeman & Brown, 1982; Wenger, 1987). In brief, a problem generator needs
an analysis of the relevant problem type into fixed and variable parts, and a list of the possible variations.
For example, problems of the form "x + y = ?" can be generated by replacing x and y with two random
numbers. A problem generator for spatial arrangement problems would be more complicated to program,
because it would have to check that the premises it generates make sense when taken together (i. e.,
that the problem being generated has a solution). A problem generator for, say, electronic trouble
shooting would be more complicated still. But problem generators for most tasks that are of interest to
test constructors can be programmed with reasonable effort.
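
Both kinds of generator mentioned above can be sketched in a few lines of Python. The addition generator fills the fixed frame "x + y = ?" with random numbers. The spatial generator draws random "left of" premises and keeps them only if exactly one linear ordering satisfies them, which is one simple way to check that the premises make sense together. The premise format, the uniqueness requirement, and all names are assumptions made for the example.

```python
import itertools
import random


def addition_item(rng, low=0, high=99):
    """Fill the fixed frame 'x + y = ?' with two random numbers."""
    x, y = rng.randint(low, high), rng.randint(low, high)
    return f"{x} + {y} = ?", x + y


def consistent_orderings(objects, premises):
    """All left-to-right orderings that satisfy every (a, 'left_of', b) premise."""
    result = []
    for perm in itertools.permutations(objects):
        position = {obj: i for i, obj in enumerate(perm)}
        if all(position[a] < position[b] for a, _, b in premises):
            result.append(perm)
    return result


def spatial_item(rng, objects, n_premises, max_tries=10_000):
    """Draw random premises until exactly one ordering fits them."""
    pairs = list(itertools.permutations(objects, 2))
    for _ in range(max_tries):
        chosen = rng.sample(pairs, n_premises)
        premises = [(a, "left_of", b) for a, b in chosen]
        fits = consistent_orderings(objects, premises)
        if len(fits) == 1:
            return premises, fits[0]
    raise RuntimeError("no uniquely solvable item found")


rng = random.Random(0)
print(addition_item(rng))
print(spatial_item(rng, ["A", "B", "C"], 2))
```

The brute-force check over all permutations is adequate only for a handful of objects; a generator for larger problems would need a less naive consistency test.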

After item generation comes item selection. The fact that information processing models are running
computer programs can be exploited in order to automate the selection process as well. By running two
or more simulation models on a particular problem, one can verify in an intersubjectively valid way
whether that problem discriminates between those models or not. Models that generate identical solution
paths for that problem are not discriminated, but models that generate different paths are. For instance,
spatial arrangement problems that can be solved by integrating the premises in the order in which they
are written in the problem text do not discriminate between different strategies for attention allocation, but
other problems do. In short, I am proposing that test items should be validated by relating them to the
theory of human performance that constitutes the basis for the test. It is this feature of the methodology
proposed here that motivates the term theory referenced test construction.

Item production can be fully automated by interleaving item generation and item selection. A
computer system for item production would generate an item, run the relevant models on it, and decide
whether to keep the item on the basis of whether it discriminates between those models. The cycle of
problem generation and model running would continue until the system has found a set of items that
makes the desired discriminations between all the relevant models. That set of problems is then a test for
whatever aspect of human cognition is described by that space of models.
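
The generate-and-select cycle just described can be sketched in Python. An item is kept only if it separates at least one pair of models that no earlier item has separated, and the cycle stops when every pair is separated or a try limit is reached. The three addition models are toy stand-ins invented for the example, and the function `produce_items` and its try limit are assumptions, not part of the proposal itself.

```python
import itertools
import random


def correct(item):
    x, y = item
    return x + y


def no_carry(item):
    # Hypothetical error: add column by column and drop every carry.
    (xt, xu), (yt, yu) = divmod(item[0], 10), divmod(item[1], 10)
    return 10 * (xt + yt) + (xu + yu) % 10


def carry_always(item):
    # Hypothetical error: carry into the tens column even when not needed.
    (xt, xu), (yt, yu) = divmod(item[0], 10), divmod(item[1], 10)
    return 10 * (xt + yt + 1) + (xu + yu) % 10


MODELS = {"correct": correct, "no_carry": no_carry, "carry_always": carry_always}


def produce_items(generate, models, max_tries=1000):
    """Interleave generation and selection; return (kept items, unresolved pairs)."""
    unresolved = set(itertools.combinations(sorted(models), 2))
    items = []
    for _ in range(max_tries):
        if not unresolved:
            break
        item = generate()
        predicted = {name: run(item) for name, run in models.items()}
        separated = {(a, b) for a, b in unresolved if predicted[a] != predicted[b]}
        if separated:
            items.append(item)
            unresolved -= separated
    return items, unresolved


rng = random.Random(0)
items, unresolved = produce_items(
    lambda: (rng.randint(0, 49), rng.randint(0, 49)), MODELS
)
print(items, unresolved)
```

With these toy models no single item separates all three pairs, because an item either requires a carry or does not, so the resulting test contains at least two items.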

Algorithm design. The relationship between a pattern of responses on a test, on the one hand, and a
space of information processing models, on the other, can be very complex. If a test is to be useful in
practical contexts, it must be possible to design an algorithm that quickly selects that model which best
accounts for any particular pattern of responses. In principle, a pattern classifier consists of a
discrimination tree that makes successive decisions depending upon the answers to each diagnostic item.
The highly successful DEBUGGY system for classification of subtraction errors (Burton, 1982), and the
construction of expert systems for medical diagnosis, electronic trouble shooting, and similar domains
(Clancey, 1985) show that complex pattern classification algorithms can be designed and programmed.
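
A discrimination tree of the kind mentioned above can be sketched in Python. The sketch assumes a table of the response that each model predicts for each item, builds the tree by repeatedly testing the item that splits the remaining models into the most groups, and classifies a response pattern by walking the tree. The table, the greedy splitting rule, and all names are assumptions made for the example; the sketch is not a description of DEBUGGY.

```python
# Hypothetical predicted responses: model -> item -> response.
PREDICTIONS = {
    "M1": {"i1": "a", "i2": "x"},
    "M2": {"i1": "a", "i2": "y"},
    "M3": {"i1": "b", "i2": "x"},
    "M4": {"i1": "b", "i2": "y"},
}


def predict(model, item):
    return PREDICTIONS[model][item]


def build_tree(candidates, items):
    """Build a nested-dict discrimination tree over the candidate models."""
    if len(candidates) <= 1 or not items:
        return {"models": sorted(candidates)}

    def groups_for(item):
        groups = {}
        for model in candidates:
            groups.setdefault(predict(model, item), []).append(model)
        return groups

    # Greedy choice: the item that splits the candidates into most groups.
    best = max(items, key=lambda item: len(groups_for(item)))
    groups = groups_for(best)
    if len(groups) == 1:
        return {"models": sorted(candidates)}
    rest = [item for item in items if item != best]
    return {
        "item": best,
        "branches": {r: build_tree(ms, rest) for r, ms in groups.items()},
    }


def classify(tree, responses):
    """Follow the observed responses down the tree; [] if no model fits."""
    while "item" in tree:
        branch = tree["branches"].get(responses[tree["item"]])
        if branch is None:
            return []
        tree = branch
    return tree["models"]


TREE = build_tree(list(PREDICTIONS), ["i1", "i2"])
print(classify(TREE, {"i1": "b", "i2": "x"}))
```

A leaf that still holds several models marks an equivalence class that the available items cannot divide further.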

Admittedly, the methodology for test construction outlined here cannot compete with the psychometric
approach with respect to the processing of test responses. Given the psychometric idea of describing a
mental state as a point in a multi-dimensional space, standard statistical techniques can be used to
process the data from any test, regardless of the problems in the test, regardless of what the test
measures, and even regardless of changes in the underlying theory, e. g., changes in the assumptions
about how many distinct abilities there are. In contrast, the methodology outlined here requires that a
new classification algorithm be designed for each new test.

In summary, theory referenced test construction proceeds by (a) constructing a space of information 
processing models, each model describing a possible state of mind, (b) producing a test, i. e., a set of 
items that can discriminate between those models, and (c) designing a pattern classification algorithm 
that selects the best-fitting model for a particular set of responses to the test items. 

The above proposal is admittedly speculative. But the two last phases of the proposed methodology,
item production and algorithm design, rely on standard programming techniques. No conceptual
advances are needed to realize those two phases of the methodology. The speculative nature of the
proposal comes to the fore in the first step. It is not obvious that we know how to construct model spaces
that simulate people with enough accuracy to be used as bases for test construction. The example
provided by research on subtraction skills is encouraging (Brown & Burton, 1978; Burton, 1982).
Furthermore, our ability to construct such model spaces is a function of the quality of our psychological
theories. Presumably, continued psychological research will lead to better and more accurate theories of
human cognition, and the better our theories, the more feasible the methodology of theory referenced test
construction.
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Appendix A. Simulation Program for S16 

The following is a runnable simulation model of subject S16. It consists of the production rules in
Figure 3-8, written in a computer-implemented production system language called PSS (Ohlsson, 1979).

(P0 (ANSWER X1) ==> SAY(X1) ;
                    STOPALL)

(P1 (NEW <QSTN>) <MODEL> ==> UNMK((NEW <QSTN>)) ;
                             GOTO(ANSW))

(P2 (NEW (FAIL GMO)) ==> UNMK((NEW (FAIL GMO))) ;
                         READ(QUESTION))

(P3A (NTC: <PROP>) <MODEL> ==> GOTO(INT))

(P3B (NEW <PROP>) <MODEL> ==> GOTO(FPM))

(P4 (NEW <MODEL>) <PROP> ==> UNMK((NEW <MODEL>)) ;
                             MARK(<PROP> ; NTC:) ;
                             GOTO(INT))

(P5A (ABS <MODEL>) (NTC: <PROP>.1) <PROP>.2 ==>
        UNMK((NTC: <PROP>.1)) ;
        RHRS(<PROP>.2) ;
        RHRS(<PROP>.1) ;
        GOTO(TRNS))

(P5B (ABS <MODEL>) (NEW <PROP>.1) <PROP>.2 ==> GOTO(FPP))

(P6 (IMP <MODEL>) ==> BKOP())

(P7A (NEW <EXPRESSION>) (MISSING: (X1)) ==>
        UNMK((NEW <EXPRESSION>)) ;
        DEL((MISSING: (X1))) ;
        SCAN((X1)) (=> PREMISE) ;
        READ(PREMISE))

(P7B (NEW <EXPRESSION>) (MISSING: (X1 <SEQ>)) ==>
        UNMK((NEW <EXPRESSION>)) ;
        REPL((MISSING: (X1 <SEQ>)) ;
             (MISSING: (<SEQ>))) ;
        SCAN((X1)) (=> PREMISE) ;
        READ(PREMISE))

(P7C (NEW <EXPRESSION>) (REMAINS = NONE) ==>
        UNMK((NEW <EXPRESSION>)) ;
        NTC(<MODEL>) ;
        GMO(<MODEL>) (=> LIST) ;
        MARK(<EXPRESSION> ; NEW) ;
        INS((MISSING: LIST)))

(P8A (NEW <EXPRESSION>) (UNUSED: (X1)) ==>
        UNMK((NEW <EXPRESSION>)) ;
        DEL((UNUSED: (X1))) ;
        READ(X1))

(P8B (NEW <EXPRESSION>) (UNUSED: (X1 <SEQ>)) ==>
        UNMK((NEW <EXPRESSION>)) ;
        REPL((UNUSED: (X1 <SEQ>)) ;
             (UNUSED: (<SEQ>))) ;
        READ(X1))

(P8C (NEW <EXPRESSION>) <MODEL> ==>
        SCAN(UNUSED) (=> LIST) ;
        INS((UNUSED: LIST)))

(P9 (NEW <EXPRESSION>) ==> UNMK((NEW <EXPRESSION>)) ;
                           READ(NEXTPREM))

(P10 BEGIN ==> READ(FIRSTPREM))